Saving HTML data in a field using BLOB or TEXT/CLOB

HopeKing :

I need to store a page of html data in a field called ProjectDescription in mySQL database using Spring & JPA 2.1 . I have read this question and all other questions with BLOB tag, but need some clarity on why the fields are saving in a certain manner in my db. I have created the fields in the manner below using both String and byte[] field types.

Method1: Saving data as TEXT (After encoding in Base64 format I save my html data as String using the below method)

@Basic(fetch = FetchType.LAZY) 
@Lob
private String projectDescription = "";

Method2: Saving data as binary using BLOB

@Basic(fetch = FetchType.LAZY)
@Lob
@Column(length=5000)
private byte[]  projectDescription1 =new byte[0];

My assumptions: I assume that since a page of html data is not very large, TEXT is ok as comapred to BLOB

I tested both and the fields are saved as below in mySQL database

In Method1:

  1. Type: TEXT
  2. DisplaySize is constantly 1431655765.

This size doesn't change irrespective of my @Column(length=5000) annotaton.

In Method2

  1. Type: BLOB
  2. DisplaySize: -1

Question1: What is the source of this DisplaySize ? This seems quite large in case of TEXT and very small(-1) in case of byte[] field type. Why doesn't @Column length seem to change change the DisplaySize.

Question2: Is it ok to store HTML data as String field type(eventually as TEXT) as opposed to byte[] (eventually as a blob)?

Note: I have read all the questions with BLOB tags and am clear that images/documents need to be saved as BLOB and text as as CLOB/TEXT. However, would like to confirm again for HTML data given how large DisplaySize is allocated in DB for TEXT.

Thanks.

Rick James :

If it is a whole page, why go through the extra layer of fetching from a database table? If it is only part of a page, then I recommend TEXT CHARACTER SET utf8mb4. Any non-UTF-8 text in the page will cause trouble; may as well catch it early.

And the database industry is converging on using UTF-8 for all text.

Base64 is 8/6 times as big. And all it is doing is avoiding trouble with non-UTF-8 characters that should not be there. If anything, compress it in the client and store in a BLOB, thereby shrinking 3/1.

In MySQL, TEXT is 64K bytes. You might be better off with MEDIUMTEXT, which has a limit of 16MB. I say "bytes" because, for example, Chinese needs 3, sometimes 4, bytes per character, so only about 25K characters of Chinese text will fit in TEXT.

"DisplaySize is constantly 1431655765" -- What??? A gigabyte for a web page; no way! Even if that includes images (which it should not), that is quite unreasonable. Edit: eggyal's Comment on 2^32/3 probably explains this odd number.

In MySQL, SELECT length(my_text) ... will get the number of bytes in that column.

Guess you like

Origin http://10.200.1.11:23101/article/api/json?id=463407&siteId=1