Python memory, basic learning, data analysis

1. Memory

        In Python, everything is an object. Python was designed as an object-oriented language from the start, and "everything is an object" is one of its core ideas.

        Java is also an object-oriented language, but it is less uniform in this respect than Python. For example, int, one of the eight primitive data types in Java, is not an object and has to be boxed into an Integer object whenever an object is required (for example, when it is stored in a collection). In Python, by contrast, everything is an object: numbers, strings, tuples, lists, dictionaries, functions, methods, classes, modules, and even your code.

concept of object

        Different programming languages define "object" in different ways. In some languages, it means that all objects must have properties and methods; in others, it means that all objects can be subclassed.

        In Python the definition is looser: some objects have neither attributes nor methods, and not all objects can be subclassed. In practice, "everything is an object" can be read as: everything in Python can be assigned to a variable or passed to a function as an argument.
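A minimal sketch of this idea: the snippet below binds a function, a class and a module to ordinary variables and passes them to another function. The helper names describe and greet are made up for this illustration.

```python
import math

def describe(obj):
    """Return the name of the type of whatever object is passed in."""
    return type(obj).__name__

def greet():
    return "hello"

# Functions, classes and modules can be bound to variables like any other value.
f = greet
c = int
m = math

print(describe(f))    # function
print(describe(c))    # type
print(describe(m))    # module
print(describe(42))   # int
print(f())            # hello
```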

Every object in Python has three characteristics: identity (id()), type (type()), and value.

Identity: Each object has a unique identity that distinguishes it from other objects. It can be obtained with the built-in function id(). In CPython, this value is the memory address of the object.

Type: The type of an object determines what type of value the object can hold, what properties and methods it has, what operations it can perform, and what rules it follows. You can use the built-in function type() to see the type of an object.

Value: the data represented by the object

Identity, type and value are all assigned when an object is created. If the object supports update operations, its value is mutable; otherwise it is read-only (numbers, strings and tuples, for example, are immutable). All three exist for as long as the object exists.
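As a small illustration of identity, type and value, the sketch below updates a mutable list in place and then "changes" an immutable string. The variable names are chosen only for this example.

```python
numbers = [1, 2, 3]                  # a list is mutable
id_before = id(numbers)
numbers.append(4)                    # updates the value in place
list_kept_identity = id(numbers) == id_before

text = "abc"                         # a string is immutable
original = text                      # keep a second reference to the old object
text = text + "d"                    # builds a new object and rebinds the name
str_kept_identity = text is original

print(type(numbers).__name__, numbers, list_kept_identity)   # list [1, 2, 3, 4] True
print(type(text).__name__, text, str_kept_identity)          # str abcd False
```

The list keeps its identity while its value changes; the string name ends up bound to a different object.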

        Technically speaking, each object carries two standard header fields: a type identifier that identifies its type, and a reference counter that determines whether the object can be reclaimed.

In C, C++ or Java, a value is stored in memory and the variable refers to that memory location (in heap memory or stack memory).

In C language and python, variables are stored in the following form:

        When Python stores data, it uses some extra memory for information that describes the data. That bookkeeping is implemented in C. When we change the value of a variable, Python in effect creates a new object and rebinds the name to it, and the underlying C code updates all the related information automatically. That code is complicated, but it was written as part of Python itself, so we only need to write the assignment. This is one of Python's conveniences.

        So whenever you bind a variable (say a = 200), the name "a" points to a PyObject in memory. If a new object has to be created for it, that object's ref count starts at 1.

PyObjects in memory:

  • Type: Integer, String, Float, etc.
  • Reference count (ref count): the number of references bound to the object
  • Value: value/data/information

But what is a ref count?

Let's take an example. We have a variable "a" of type integer with the value 200, and we need another variable "b", also an integer with the value 200. So you've created two variables like this:

a = 200
b = 200

You might guess that there are now two objects in memory, one for "a" and one for "b". That is not the case: "a" and "b" point to the same object, which now has two references bound to it.

assign new object to variable

When a new object is assigned to an existing variable, the ref count of the previous object is decremented by 1. The variable then refers to a different object, so id() returns a different value:

>>> a=1
>>> id(a)
94147440556736
>>> a=2
>>> id(a)
94147440556768
>>>

Now, what happens when an object's ref count reaches 0? Does it stay in memory?

No. In CPython, once an object's ref count drops to 0, its memory is reclaimed right away; the separate garbage collector only has to deal with objects that reference each other in cycles.
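The ref count can be observed with sys.getrefcount(), and the moment an object disappears can be observed with a weak reference. The following is a sketch that assumes CPython; the Payload class is a made-up placeholder. A user-defined object is used instead of a small integer because CPython treats small integers specially, so their counts are not informative.

```python
import sys
import weakref

class Payload:
    """Hypothetical placeholder class; any user-defined class would do."""

first = Payload()
# The reported number includes the temporary reference held by the call itself,
# so only the difference between two readings is meaningful.
count_one_name = sys.getrefcount(first)

second = first                        # bind a second name to the same object
count_two_names = sys.getrefcount(first)

watcher = weakref.ref(first)          # a weak reference does not add to the ref count
del second                            # ref count decremented by 1
del first                             # last reference gone
collected = watcher() is None         # on CPython the object has already been freed

print(count_two_names - count_one_name)   # 1
print(collected)                          # True on CPython
```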

Objects are equal

The == operator tests whether the values of two referenced objects are equal.
The is operator tests whether two references point to the same object.

For small numbers and short strings, the result of is can differ from what you might expect: two variables that were assigned equal values separately may still be the same object.

This is caused by Python's caching mechanism: small numbers and short strings are cached and reused, so a and b point to the same object.
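The difference between == and is, and the effect of the cache, can be sketched as follows. The caching shown here is CPython behaviour (integers from -5 to 256 are cached), not something the language guarantees, so code should not rely on it.

```python
list_a = [1, 2, 3]
list_b = [1, 2, 3]
equal_values = list_a == list_b       # True: same contents
same_object = list_a is list_b        # False: two separate list objects

alias = list_a
alias_is_same = alias is list_a       # True: both names refer to one object

# CPython implementation detail: small integers are cached and shared.
a = 200
b = 200
small_shared = a is b                 # True on CPython

# Larger integers built at run time are usually separate objects,
# even though their values compare equal.
big_a = int("100000")
big_b = int("100000")
big_equal = big_a == big_b            # True
big_shared = big_a is big_b           # typically False on CPython

print(equal_values, same_object, alias_is_same)
print(small_shared, big_equal, big_shared)
```

In everyday code, use == to compare values and reserve is for identity checks such as "x is None".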

------------------------------------------------------------------------------------------------------------------------------

2. Basic

application:

python lists and Numpy arrays

Difference: see "Difference between python list and Numpy array" (herr_whf's blog, CSDN)

1 Both can be used to process multidimensional arrays.

        The ndarray object in NumPy handles multidimensional arrays; it acts as a fast and flexible container for large data sets.

        Python lists can store one-dimensional arrays, and multi-dimensional arrays can be realized by nesting lists.

2 Storage efficiency and input/output performance differ.

        NumPy is designed specifically for array manipulation and computation. Its storage efficiency and input/output performance are much better than those of nested Python lists, and the larger the array, the more obvious the advantage.

3 Element data types differ.

        In general, all elements in a NumPy array must be of the same type, while the elements of a Python list can be of any type. So in terms of generality, NumPy arrays are less flexible than Python lists. In scientific computing, however, they remove the need for many loop statements, and the resulting code is much simpler than the equivalent code with Python lists.
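The three points above can be sketched in a few lines. This example assumes NumPy is installed and imported under its usual alias np; the sample values are arbitrary.

```python
import numpy as np

prices = [1.5, 2.0, 3.25]

# Plain list: element-wise arithmetic needs an explicit loop or comprehension.
doubled_list = [p * 2 for p in prices]

# NumPy array: the operation is applied to every element without a loop.
arr = np.array(prices)
doubled_arr = arr * 2

# A list may mix types freely; an array stores a single dtype,
# so mixing an int and a float yields an array of floats.
mixed_list = [1, "two", 3.0]
ints_and_floats = np.array([1, 2.5])

# A nested list becomes a true two-dimensional array with a shape.
grid = np.array([[1, 2], [3, 4]])

print(doubled_list)                       # [3.0, 4.0, 6.5]
print(doubled_arr.tolist())               # [3.0, 4.0, 6.5]
print(arr.dtype, ints_and_floats.dtype)   # float64 float64
print(grid.shape)                         # (2, 2)
```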

1. python list

2. Numpy array

3. Object

        Class: defines the properties and methods common to every object in the collection.

                Class variables: shared by all instantiated objects. They are defined inside the class but outside the body of any method, and they are generally not used as instance variables. All instances of the class share them, so a class variable is a common resource for every instantiated object.

                Local variables: variables defined inside a method; they exist only while that method is running.

                Instance variables: variables defined inside the methods of the class in the form "self.variable_name".

                Method: a function defined in a class.

        Class methods and class variables: there are two ways to access them, either directly through the class name or through an instantiated object of the class. Through the class name you can not only read a class variable but also modify its value. Because a class variable is shared by all instantiated objects, modifying it through the class name affects every instance (comparable to a static variable in languages such as Java).

        It is worth mentioning that, in addition to accessing class variables through class names, you can also dynamically add class variables to classes and objects. For example, on the basis of the CLanguage class, add the following code:

clang = CLanguage()
CLanguage.catalog = 13
print(clang.catalog)

        Instance variable: a variable defined in the form "self.variable_name" inside any method of the class. It belongs only to the object on which the method is called.

        Class variables can be read through an instance of the class, but they cannot be modified that way. Assigning to a class variable through an instance does not change the class variable; it defines a new instance variable with the same name. Only the class name can be used to modify the value of a class variable.
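The behaviour of class variables and instance variables can be sketched as follows. The Course class and its attribute names are hypothetical and used only for this illustration; they are not the CLanguage class mentioned above.

```python
class Course:
    """Hypothetical class used only for this illustration."""
    level = "basic"                  # class variable, shared by all instances

    def __init__(self, name):
        self.name = name             # instance variable, one per object

first = Course("python")
second = Course("numpy")

Course.level = "advanced"            # modify through the class name
seen_by_first = first.level          # every instance sees the new value
seen_by_second = second.level

first.level = "custom"               # assignment through an instance creates
                                     # a new instance variable on "first" only
class_value_after = Course.level     # the class variable is unchanged
second_value_after = second.level

Course.catalog = 13                  # class variable added dynamically

print(seen_by_first, seen_by_second)                        # advanced advanced
print(first.level, class_value_after, second_value_after)   # custom advanced advanced
print(second.catalog)                                       # 13
```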

        Local variables: can only be used inside the function in which they are defined. When the function finishes executing, its local variables are destroyed.

        

        Method overriding: if a method inherited from the parent class does not meet the needs of the subclass, the subclass can redefine it. This is called method overriding, also known as method rewriting.

        

        Inheritance: a derived class inherits the fields and methods of a base class. Inheritance also allows an object of the derived class to be treated as an object of the base class. For example, if the Dog class is derived from the Animal class, this models an "is-a" relationship (a Dog is an Animal).

  • The __init__() method is a special method, called the constructor or initialization method of a class, that is called when an instance of the class is created

  • Methods of a class differ from ordinary functions in only one way: they must have an additional first parameter, which by convention is called self

  • self represents the instance of the class, that is, the current object, while self.__class__ points to the class

          One of the main benefits of object-oriented programming is code reuse, and one of the ways to achieve this reuse is through the inheritance mechanism.
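Inheritance, method overriding, __init__() and self can be seen together in a short sketch built on the Animal and Dog example mentioned above. The method names speak and describe, and the name "Rex", are made up for this illustration.

```python
class Animal:
    def __init__(self, name):
        self.name = name                     # runs when an instance is created

    def speak(self):
        return f"{self.name} makes a sound"

    def describe(self):
        # self.__class__ points to the class of the current object
        return f"{self.name} is a {self.__class__.__name__}"

class Dog(Animal):                           # Dog inherits from Animal
    def speak(self):                         # overrides the inherited method
        return f"{self.name} barks"

dog = Dog("Rex")
print(dog.speak())                  # Rex barks
print(dog.describe())               # Rex is a Dog
print(isinstance(dog, Animal))      # True
```

Dog reuses __init__() and describe() from Animal without repeating them and replaces only speak().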


data analysis:

What is data analysis: collecting large amounts of data, analyzing it with appropriate methods and efficient tools, and classifying and summarizing it. It is the process of extracting the most valuable information, forming effective conclusions, and getting the maximum value out of the data.

Tools and Skills:

excel: Not an important skill but everyone knows it.

sql: very important

spss: Professional statistical analysis software (not everyone uses it), no need to be proficient

Python: for large amounts of complex data, used alongside sql to automate operations

Business data analysis:

 

 Python data analysis:

EXCEL:

1. Excel is part of the Office suite and is bundled and sold together with it

2. WPS Office and Microsoft Office are currently the two most popular office suites in China. Microsoft Office is developed by Microsoft. WPS Office is developed by Kingsoft Co., Ltd.

3. WPS contains many convenient components and is free to use. It includes the most common Word, Excel and PPT equivalents, as well as PDF reading, document repair, Powerword, notes, mail and other packages, but it has no counterpart to Access (database). Advantages: small installation size, "Cloud Office", fast to open, fast to install, and the widest platform support, including Linux, which Office does not support. Disadvantages: it is less suited to specialized professional work, and compatibility is poor. Because it is free, it shows many advertisements (which you can remove by paying). Opening docx, xlsx and pptx files can freeze, and a file sent on to be opened in Office may show layout drift.

4. Office contains many professional components, and it is paid software with a high price. It includes the three most common applications (Word, Excel, PPT). The latest version of Word can open and edit PDF files, although for daily PDF work I most recommend Adobe Reader pro. That is off topic; next time I will explain why I mainly recommend this PDF editor. The desktop version of Office also includes Access (database), OneNote (notebook), Outlook (mail), Publisher (page layout), Skype (voice calls), OneDrive (cloud storage) and other professional companion software. Its strengths are strong support for professional work, good compatibility, editable PDF, and smooth PPT playback. It can even cut out parts of pictures, and Excel can connect to servers and external data.

5. System compatibility: WPS is truly cross-platform and also runs on Linux (an open source system). Office supports only Windows, Mac, Android and iOS, although that can still be called full-platform. However, Office on the Mac is heavily cut down. On the one hand, Apple does not want Microsoft to do too well on its platform; on the other, Microsoft does not want to do too well on someone else's. As a result, working with Office on a Mac is not a great experience for customers on either side. Office is promoted mainly on the Windows platform.


Origin blog.csdn.net/zr_xs/article/details/131097729