[Repost][Translation] JNI local reference changes in Android Ice Cream Sandwich (ICS, 4.0+)

https://blog.k-res.net/archives/1525.html

Translator's preface:

I came across this article while hunting for the cause of a bug in a project. The project originally had its target SDK version set to 8 (a 2.x release that predates ICS 4.0), and everything ran normally on an S3 running 4.0+. Once the target was raised to 14, running on the same S3 produced a native crash like the following:

05-13 14:07:13.139: E/dalvikvm(22265): JNI ERROR (app bug): attempt to use stale local reference 0x20d00001
05-13 14:07:13.139: E/dalvikvm(22265): VM aborting
05-13 14:07:13.139: A/libc(22265): Fatal signal 11 (SIGSEGV) at 0xdeadd00d (code=1), thread 22457 (Thread-1276)
05-13 14:07:13.239: I/DEBUG(1894): *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
05-13 14:07:13.249: I/DEBUG(1894): Build fingerprint: 'samsung/m0zc/m0chn:4.1.2/JZO54K/I9300ZCEMB1:user/release-keys'
05-13 14:07:13.249: I/DEBUG(1894): pid: 22265, tid: 22457, name: Thread-1276 >>> cn.android.app <<<
05-13 14:07:13.249: I/DEBUG(1894): signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr deadd00d
05-13 14:07:13.489: I/DEBUG(1894): r0 00000000 r1 00000000 r2 deadd00d r3 00000000
05-13 14:07:13.489: I/DEBUG(1894): r4 408cb1a8 r5 0000020c r6 20d00001 r7 fffff86c
05-13 14:07:13.489: I/DEBUG(1894): r8 5ee308dc r9 00004e58 sl fffff870 fp 5ee307b8
05-13 14:07:13.489: I/DEBUG(1894): ip 00004000 sp 5ee30540 lr 400f7c95 pc 40866e50 cpsr 60000030
05-13 14:07:13.489: I/DEBUG(1894): d0 3ff000003f800000 d1 0000000000000000
05-13 14:07:13.489: I/DEBUG(1894): d2 0000000000000000 d3 0000000000000000
05-13 14:07:13.489: I/DEBUG(1894): d4 0000000000000000 d5 0000000000000000
05-13 14:07:13.489: I/DEBUG(1894): d6 0000000000000000 d7 0000000000000000
05-13 14:07:13.489: I/DEBUG(1894): d8 0000000000000000 d9 0000000000000000
05-13 14:07:13.489: I/DEBUG(1894): d10 0000000000000000 d11 0000000000000000
05-13 14:07:13.489: I/DEBUG(1894): d12 0000000000000000 d13 0000000000000000
05-13 14:07:13.489: I/DEBUG(1894): d14 0000000000000000 d15 0000000000000000
05-13 14:07:13.489: I/DEBUG(1894): d16 0000000000000000 d17 0000000000000000
05-13 14:07:13.489: I/DEBUG(1894): d18 0000000000000000 d19 0000000000000000
05-13 14:07:13.489: I/DEBUG(1894): d20 0000000000000000 d21 0000000000000000
05-13 14:07:13.489: I/DEBUG(1894): d22 0000000000000000 d23 0000000000000000
05-13 14:07:13.489: I/DEBUG(1894): d24 0000000000000000 d25 0000000000000000
05-13 14:07:13.489: I/DEBUG(1894): d26 0000000000000000 d27 0000000000000000
05-13 14:07:13.489: I/DEBUG(1894): d28 0000000000000000 d29 0000000000000000
05-13 14:07:13.489: I/DEBUG(1894): d30 0000000000000000 d31 0000000000000000
05-13 14:07:13.489: I/DEBUG(1894): scr 60000010
05-13 14:07:13.499: I/DEBUG(1894): backtrace:
05-13 14:07:13.499: I/DEBUG(1894): #00 pc 00045e50 /system/lib/libdvm.so (dvmAbort+75)
05-13 14:07:13.499: I/DEBUG(1894): #01 pc 00028c3c /system/lib/libdvm.so (IndirectRefTable::get(void*) const+336)
05-13 14:07:13.499: I/DEBUG(1894): #02 pc 00049eeb /system/lib/libdvm.so (dvmDecodeIndirectRef(Thread*, _jobject*)+30)
05-13 14:07:13.499: I/DEBUG(1894): #03 pc 0004ca77 /system/lib/libdvm.so
05-13 14:07:13.499: I/DEBUG(1894): #04 pc 00653480 /data/data/cn.android.app/lib/libgameapp.so (CKSoundManager::LoadBGM(char const*)+56)
05-13 14:07:13.509: I/DEBUG(1894): memory map around fault addr deadd00d:
05-13 14:07:13.509: I/DEBUG(1894): be9ae000-be9cf000 [stack]
05-13 14:07:13.509: I/DEBUG(1894): (no map for address)
05-13 14:07:13.509: I/DEBUG(1894): ffff0000-ffff1000 [vectors]
05-13 14:07:13.674: I/DEBUG(1894): !@dumpstate -k -t -z -d -o /data/log/dumpstate_app_native -m 22265

The key clues in the crash output above are "attempt to use stale local reference" and dvmDecodeIndirectRef on the call stack, which point to a bad use of a local reference to a Java object during a JNI call. Searching on those keywords led me to a related article that appears to have been written by a developer on the Android Dalvik team. Following its explanation, I corrected the sloppy JNI code without trouble and the problem went away. I felt it was worth translating the full text to deepen my own understanding (my ability is limited, so please point out any mistranslations!):

Main text:

[This post is by Elliott Hughes, a software engineer on the Dalvik team. -Tim Bray]

If you don't write native code that uses JNI, this article is of little use to you. If you do, you really should read it.

What has changed? Why?

Every developer wants a good garbage collector (GC). The best garbage collectors move objects around. This allows cheaper allocation and bulk deallocation, avoids heap fragmentation, and may improve locality. Moving objects is a problem, however, if you have handed pointers to those objects to native code. JNI uses types such as jobject to solve this: instead of handing out a direct pointer, it gives you an opaque handle that can be converted into an actual pointer when necessary. With handles, when the GC moves an object it only has to update the handle table to point to the object's new location. This means native code is not left holding dangling pointers every time the GC runs.
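To make the handle idea concrete, here is a minimal C++ sketch of a handle table. It is a toy model only and makes no claim about how Dalvik is implemented: ToyHeap, Handle, allocate, resolve, slotOf, moveObjects, and handleSurvivesMove are all made-up names. It shows that when the "collector" moves objects, only the table entries change, so a handle keeps resolving to the right object while a raw pointer taken earlier ends up looking at something else.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: every name here is hypothetical, not part of JNI or Dalvik.
using Handle = std::size_t;  // opaque to the caller; internally an index into the handle table

class ToyHeap {
 public:
  // Store a value and hand back an opaque handle to it.
  Handle allocate(const std::string& value) {
    slots_.push_back(value);
    table_.push_back(slots_.size() - 1);
    return table_.size() - 1;
  }

  // Convert a handle into the object's current storage.
  const std::string& resolve(Handle h) const { return slots_.at(table_.at(h)); }

  // Where the object currently lives (exposed only so the move can be observed).
  std::size_t slotOf(Handle h) const { return table_.at(h); }

  // Simulate a moving collector: reverse the storage order, then fix up the table.
  void moveObjects() {
    std::reverse(slots_.begin(), slots_.end());
    for (std::size_t& slot : table_) slot = slots_.size() - 1 - slot;
  }

 private:
  std::vector<std::string> slots_;  // the "heap"
  std::vector<std::size_t> table_;  // handle -> current slot
};

// Returns true if the handle still finds "hello" after the move
// while the raw pointer taken before the move no longer does.
bool handleSurvivesMove() {
  ToyHeap heap;
  Handle hello = heap.allocate("hello");
  heap.allocate("world");
  const std::string* rawPointer = &heap.resolve(hello);  // a direct pointer, pre-ICS style
  heap.moveObjects();
  return *rawPointer != "hello" && heap.resolve(hello) == "hello";
}
```

Native code that held only the handle never notices the move; code that kept rawPointer is now silently reading a different object, which is the class of bug the switch to indirect references is meant to expose.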

In previous versions of Android, we did not use indirect handles; we used direct pointers. This did not seem like a problem as long as we had no garbage collector that moves objects, but it let developers write buggy code that still appeared to work. In ICS, even though we have not yet implemented such a garbage collector, we have moved to indirect references so that you can start detecting bugs in your native code.

ICS provides a JNI bug-compatibility mode: as long as the targetSdkVersion in your AndroidManifest.xml is lower than ICS (that is, below 14), your code is exempt. But once you update your targetSdkVersion, your code has to be correct!

CheckJNI has been updated to detect and report these errors, and in ICS CheckJNI is on by default if android:debuggable="true" is set in your manifest.

A quick primer on JNI references

In JNI, there are several kinds of reference. The two most important are local references and global references. Any given jobject can be either local or global. (There are also weak globals, but they have a separate type, jweak, and are not covered here.)

The global/local distinction affects both lifetime and scope. A global reference can be used from any thread, through that thread's JNIEnv*, and stays valid until DeleteGlobalRef() is explicitly called. A local reference can only be used on the thread it was originally handed to, and stays valid until DeleteLocalRef() is explicitly called or, more commonly, until you return from your native method. When a native method returns, all of its local references are deleted automatically.

In the old system, local references were direct pointers and never really became invalid. That meant you could use a local reference indefinitely, even after explicitly calling DeleteLocalRef() on it or implicitly deleting it with PopLocalFrame().

Although any given JNIEnv* is only valid on one thread, Android previously kept no per-thread state in a JNIEnv*, so using a JNIEnv* on the wrong thread went unpunished. Now that each thread has its own local reference table, it is vital to use a JNIEnv* only on the right thread.

Those are the bugs that ICS will detect. I will go through a few common cases to illustrate these problems, how to spot them, and how to fix them. It is important that you do fix them, because it is very likely that a future Android release will ship a collector that moves objects. We cannot offer the bug-compatibility mode forever.

Common JNI reference bugs

Bug: Forgetting to call NewGlobalRef() when stashing a jobject in a native peer

If you have a native peer (a long-lived native object that corresponds to a Java object, usually created when the Java object is created and destroyed when the Java object's finalizer runs), you must not stash a jobject in that native object, because it will no longer be valid the next time you try to use it. (The same goes for JNIEnv*: it may still be valid if the next native call happens on the same thread, but otherwise it will not be.)

 class MyPeer {
 public:
   MyPeer(jstring s) {
     str_ = s; // Error: stashing a reference without ensuring it's global.
   }
   jstring str_;
 };

 static jlong MyClass_newPeer(JNIEnv* env, jclass) {
   jstring local_ref = env->NewStringUTF("hello, world!");
   MyPeer* peer = new MyPeer(local_ref);
   return static_cast<jlong>(reinterpret_cast<uintptr_t>(peer));
   // Error: local_ref is no longer valid once we return, but we've stored it in 'peer'.
 }

 static void MyClass_printString(JNIEnv* env, jclass, jlong peerAddress) {
   MyPeer* peer = reinterpret_cast<MyPeer*>(static_cast<uintptr_t>(peerAddress));
   // Error: peer->str_ is invalid!
   ScopedUtfChars s(env, peer->str_);
   std::cout << s.c_str() << std::endl;
 }

The fix is to store only JNI global references. Because JNI global references are never freed automatically, it is critically important that you free them yourself. This is slightly awkward because your destructor has no JNIEnv*. The simplest solution is usually to add an explicit destroy function to your native peer and call it from the Java peer's finalizer.

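 class MyPeer {
 public:
   MyPeer(JNIEnv* env, jstring s) {
     // NewGlobalRef() returns a jobject, so C++ needs an explicit cast back to jstring.
     this->s = static_cast<jstring>(env->NewGlobalRef(s));
   }
   ~MyPeer() {
     assert(s == NULL); // destroy() must have been called before destruction.
   }
   void destroy(JNIEnv* env) {
     env->DeleteGlobalRef(s);
     s = NULL;
   }
   jstring s;
 };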

You should always keep your NewGlobalRef()/DeleteGlobalRef() calls paired. CheckJNI will catch global reference leaks, but the limit is quite high (2000 by default), so be careful.

If your code does have such errors, you will receive a crash message similar to this:

    JNI ERROR (app bug): accessed stale local reference 0x5900021 (index 8 in a table of size 8)
    JNI WARNING: jstring is an invalid local reference (0x5900021)
                 in LMyClass;.printString:(J)V (GetStringUTFChars)
    "main" prio=5 tid=1 RUNNABLE
      | group="main" sCount=0 dsCount=0 obj=0xf5e96410 self=0x8215888
      | sysTid=11044 nice=0 sched=0/0 cgrp=[n/a] handle=-152574256
      | schedstat=( 156038824 600810 47 ) utm=14 stm=2 core=0
      at MyClass.printString(Native Method)
      at MyClass.main(MyClass.java:13)

If you use a JNIEnv* from the wrong thread, you will receive a crash message similar to this:

    JNI WARNING: threadid=8 using env from threadid=1
                 in LMyClass;.printString:(J)V (GetStringUTFChars)
    "Thread-10" prio=5 tid=8 NATIVE
      | group="main" sCount=0 dsCount=0 obj=0xf5f77d60 self=0x9f8f248
      | sysTid=22299 nice=0 sched=0/0 cgrp=[n/a] handle=-256476304
      | schedstat=( 153358572 709218 48 ) utm=12 stm=4 core=8
      at MyClass.printString(Native Method)
      at MyClass$1.run(MyClass.java:15)

Bug: Mistakenly assuming that FindClass() returns a global reference

FindClass() returns a local reference. Many people assume otherwise. On a system without class unloading, such as Android, you can treat jfieldID and jmethodID as if they were global. (They are not actually references, but on a system with class unloading they have similar lifetime issues.) A jclass, however, is a reference, and FindClass() returns a local one. A common bug pattern is "static jclass". Unless you manually turn the local reference into a global reference, your code is broken. Here is what correct code looks like:

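 static jclass gMyClass;
 static jclass gSomeClass;

 static void MyClass_nativeInit(JNIEnv* env, jclass myClass) {
   // 'myClass' (and any other non-primitive arguments) are only local references.
   // NewGlobalRef() returns a jobject, so C++ needs an explicit cast back to jclass.
   gMyClass = static_cast<jclass>(env->NewGlobalRef(myClass));

   // FindClass() only returns a local reference.
   jclass someClass = env->FindClass("SomeClass");
   if (someClass == NULL) {
     return; // FindClass() has already thrown an exception such as NoClassDefFoundError.
   }
   gSomeClass = static_cast<jclass>(env->NewGlobalRef(someClass));
 }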

If your code does have such errors, you will receive a crash message like this:

    JNI ERROR (app bug): attempt to use stale local reference 0x4200001d (should be 0x4210001d)
    JNI WARNING: 0x4200001d is not a valid JNI reference
                 in LMyClass;.useStashedClass:()V (IsSameObject)

Bug: Calling DeleteLocalRef() and continuing to use the deleted reference

It should go without saying that using a reference after calling DeleteLocalRef() on it is illegal, but because it used to work, you may have made this mistake without realizing it. The usual pattern is native code with a long-running loop: the developer tries to clean up every single local reference to avoid hitting the local reference limit, and accidentally deletes the very reference that was meant to be the return value!

The solution is simple: don't call DeleteLocalRef() on the references you still need (including the return value).
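The loop pattern described above can be sketched with a toy local reference table in C++. Everything here is an assumption made for illustration, not the real JNI or Dalvik API: ToyRef, ToyLocalRefTable, add, remove, get, and longestWord are hypothetical names, and the index-plus-serial encoding is just one simple way to make a reused slot distinguishable from the reference that used to occupy it. The loop frees each temporary reference as soon as it is done with it, but never the one it is about to return.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Toy model only: hypothetical names, not the real JNI or Dalvik API.
using ToyRef = std::uint32_t;  // low 16 bits: slot index, high 16 bits: slot serial

class ToyLocalRefTable {
 public:
  // Create a reference in the first free slot, growing the table if needed.
  ToyRef add(const std::string& obj) {
    std::size_t i = 0;
    while (i < slots_.size() && slots_[i].live) ++i;
    if (i == slots_.size()) slots_.emplace_back();
    Slot& s = slots_[i];
    s.obj = obj;
    s.live = true;
    ++s.serial;  // a reused slot gets a new serial, so old references to it go stale
    return (static_cast<ToyRef>(s.serial) << 16) | static_cast<ToyRef>(i);
  }

  // Delete a reference; any later use of it throws.
  void remove(ToyRef ref) { slots_[checkedIndex(ref)].live = false; }

  // Look up the object behind a reference, rejecting stale references.
  const std::string& get(ToyRef ref) const { return slots_[checkedIndex(ref)].obj; }

 private:
  struct Slot {
    std::string obj;
    std::uint16_t serial = 0;
    bool live = false;
  };

  std::size_t checkedIndex(ToyRef ref) const {
    std::size_t i = ref & 0xFFFFu;
    if (i >= slots_.size() || !slots_[i].live ||
        static_cast<ToyRef>(slots_[i].serial) != (ref >> 16)) {
      throw std::runtime_error("attempt to use stale local reference");
    }
    return i;
  }

  std::vector<Slot> slots_;
};

// A long-running loop that cleans up its temporaries but keeps the
// one reference it is going to return.
ToyRef longestWord(ToyLocalRefTable& refs, const std::vector<std::string>& words) {
  ToyRef best = refs.add("");
  for (const std::string& w : words) {
    ToyRef candidate = refs.add(w);
    if (refs.get(candidate).size() > refs.get(best).size()) {
      refs.remove(best);       // the old best is never used again
      best = candidate;        // keep the candidate: it may be the return value
    } else {
      refs.remove(candidate);  // safe: never used again
    }
  }
  return best;                 // still live, because it was never removed
}
```

Calling refs.remove(best) just before the return statement would reproduce the bug: the caller would receive a reference that the table rejects on first use.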

Bug: Calling PopLocalFrame() and continuing to use a popped reference

This is a more subtle variant of the previous bug. PushLocalFrame() and PopLocalFrame() let you delete local references in bulk. When you call PopLocalFrame(), you pass in the one reference from the frame that you want to keep (typically to use as the return value), or NULL. In the past, you would have gotten away with incorrect code like this:

 static jobjectArray MyClass_returnArray(JNIEnv* env, jclass) {
   env->PushLocalFrame(256);
   jobjectArray array = env->NewObjectArray(128, gMyClass, NULL);
   for (int i = 0; i < 128; ++i) {
       env->SetObjectArrayElement(array, i, newMyClass(i));
   }
   env->PopLocalFrame(NULL); // Error: should pass 'array'.
   return array; // Error: array is no longer valid.
 }

The fix is generally to pass the reference to PopLocalFrame() and to return the reference it hands back, which is valid in the previous frame. Note that in the example above you do not need to keep references to the individual array elements; as long as the GC knows about the array itself, it will take care of the elements (and any objects they point to in turn).
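The difference between popping with NULL and popping with the result can be sketched with a toy model of local reference frames in C++. This is an illustration under stated assumptions, not the JNI API: ToyLocalFrames, pushFrame, popFrame, newRef, find, buggyReturn, and fixedReturn are hypothetical names. In real JNI code the corresponding fix is to return the value of env->PopLocalFrame(array) instead of array itself.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy model only: hypothetical names, not the real JNI API.
using ToyRef = int;
const ToyRef kNullRef = 0;

class ToyLocalFrames {
 public:
  ToyLocalFrames() : frames_(1) {}  // start with the native method's own frame

  void pushFrame() { frames_.emplace_back(); }

  // Create a local reference in the current (innermost) frame.
  ToyRef newRef(const std::string& obj) {
    ToyRef ref = next_++;
    frames_.back()[ref] = obj;
    return ref;
  }

  // Returns the object behind a reference, or nullptr if the reference is stale.
  const std::string* find(ToyRef ref) const {
    for (const auto& frame : frames_) {
      auto it = frame.find(ref);
      if (it != frame.end()) return &it->second;
    }
    return nullptr;
  }

  // Drop every reference in the current frame. If 'result' is a live
  // reference, re-create it in the previous frame and return the new one.
  ToyRef popFrame(ToyRef result) {
    assert(frames_.size() > 1);
    const std::string* found = find(result);
    bool keep = (found != nullptr);
    std::string kept = keep ? *found : std::string();  // copy before the frame goes away
    frames_.pop_back();
    return keep ? newRef(kept) : kNullRef;
  }

 private:
  std::vector<std::map<ToyRef, std::string>> frames_;
  ToyRef next_ = 1;  // 0 is reserved for kNullRef
};

// Mirrors the buggy pattern: pop with "NULL", then return a reference from the popped frame.
ToyRef buggyReturn(ToyLocalFrames& env) {
  env.pushFrame();
  ToyRef array = env.newRef("array");
  env.popFrame(kNullRef);  // bug: 'array' dies together with the frame
  return array;            // stale
}

// Mirrors the fix: hand the result to popFrame and return what it gives back.
ToyRef fixedReturn(ToyLocalFrames& env) {
  env.pushFrame();
  ToyRef array = env.newRef("array");
  return env.popFrame(array);  // valid in the caller's frame
}
```

In this model, env.find(buggyReturn(env)) yields nullptr, while the reference returned by fixedReturn(env) still resolves to the stored object.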

If your code does have such errors, you will receive a crash message like this:

    JNI ERROR (app bug): accessed stale local reference 0x2d00025 (index 9 in a table of size 8)
    JNI WARNING: invalid reference returned from native code
                 in LMyClass;.returnArray:()[Ljava/lang/Object;

Wrapping up

Yes, we are asking for a little more attention to detail in your JNI code, which is extra work. But we think you will come out ahead as we bring in better and more sophisticated memory management code.

Original article (behind the firewall!):

JNI Local Reference Changes in ICS
http://android-developers.blogspot.com/2011/11/jni-local-reference-changes-in-ics.html

Origin blog.csdn.net/hegan2010/article/details/105860295