How to fix garbled characters in Java
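Java represents every string internally as Unicode (specifically, as a sequence of UTF-16 code units).
Take an arbitrary string: String string = "測試字符串"; (the literal means "test string")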
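If the source file is saved as GBK and the operating system's default encoding is also GBK, then at compile time the compiler (javac) decodes the file's bytes as GBK into characters and stores those characters in Unicode form (byte array → characters → Unicode data). At run time the JVM holds the string as Unicode as well.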
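When the string is printed, the JVM converts the Unicode characters to the encoding of the local environment (GBK in this scenario), and the operating system then displays the GBK-encoded content.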
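When the source file is UTF-8, we have to tell the compiler so: javac -encoding utf-8 … . The compiler then decodes the file as UTF-8 and again converts the characters to Unicode. So whatever encoding the source file uses, the same string literal ends up as exactly the same Unicode data, and when it is displayed it is still converted to GBK (this depends on the OS environment).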
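The point that one string has a single internal form but several external byte forms can be sketched as follows. This is a minimal illustration, not part of the original article: the class name is made up, the sample text is written with Unicode escapes so the file compiles the same under any -encoding setting, and it assumes the runtime ships the GBK charset (standard JDK builds do).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class SameTextDifferentBytes {
    // The escapes spell a two-character Chinese word meaning "test".
    static final String TEXT = "\u6d4b\u8bd5";

    // Assumption: the GBK charset is available in this runtime.
    static final Charset GBK = Charset.forName("GBK");

    public static void main(String[] args) {
        byte[] gbkBytes = TEXT.getBytes(GBK);
        byte[] utf8Bytes = TEXT.getBytes(StandardCharsets.UTF_8);

        // The external byte representations differ in length and content.
        System.out.println(gbkBytes.length + " GBK bytes, " + utf8Bytes.length + " UTF-8 bytes");

        // Decoding each array with its own charset yields the same String.
        String fromGbk = new String(gbkBytes, GBK);
        String fromUtf8 = new String(utf8Bytes, StandardCharsets.UTF_8);
        System.out.println(fromGbk.equals(fromUtf8)); // prints "true"
    }
}
```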
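How does garbled text arise?
In essence, it always comes from a mismatch between the encoding the text was originally written in and the encoding used to decode it when it is read.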
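Garbled text is text that a program displays but that cannot be read in any language; it typically contains a large number of ? characters. Almost every computer user runs into it sooner or later. The cause is that a byte stream was decoded with the wrong character encoding. So whenever you think about a text-display problem, keep one question in mind: which character encoding is in use at this point? Only then can you analyze and fix the problem correctly.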
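Two common sources of those placeholder characters can be reproduced in a few lines. The sketch below is illustrative only (the class and method names are invented for this example): encoding text into a charset that cannot represent it substitutes a ? byte, and decoding malformed bytes yields U+FFFD, the Unicode replacement character.

```java
import java.nio.charset.StandardCharsets;

class WhereQuestionMarksComeFrom {
    // Encoding into a charset that cannot represent the text: each
    // unmappable character is replaced by the charset's substitute byte.
    static String encodeLossy(String text) {
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.ISO_8859_1);
    }

    // Decoding bytes that are not valid in the chosen charset: each
    // malformed sequence becomes the Unicode replacement character.
    static String decodeMalformed(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Two Chinese characters, neither representable in ISO-8859-1.
        System.out.println(encodeLossy("\u6d4b\u8bd5")); // prints "??"

        byte[] notUtf8 = {(byte) 0xB2, (byte) 0xE2};
        System.out.println(decodeMalformed(notUtf8).contains("\uFFFD")); // prints "true"
    }
}
```

In the first case the original characters are gone for good; no later decoding step can bring them back.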
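Take the most common case, a garbled web page. If you maintain the site and run into this, check the following:
● Whether the Content-Type response header returned by the server specifies a character encoding
● Whether the page specifies a character encoding with a META HTTP-EQUIV tag
● Whether the encoding the page file is actually saved in matches the encoding the page declares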
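The first check in the list above can be automated on the client side. The following is a rough sketch under stated assumptions: charsetOf is a hypothetical helper written for this example, not a JDK or servlet API, and it handles only the simple "type; charset=name" header shape.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

class ContentTypeCharset {
    // Hypothetical helper: returns the charset named in a Content-Type
    // header value, or the supplied fallback when none is declared.
    static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Illegal or unsupported charset name: use the fallback.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        Charset fallback = StandardCharsets.ISO_8859_1;
        System.out.println(charsetOf("text/html; charset=UTF-8", fallback)); // prints "UTF-8"
        System.out.println(charsetOf("text/html", fallback));                // prints "ISO-8859-1"
    }
}
```

When the header carries no charset, the reader has to fall back on something, and a wrong fallback is exactly how a correctly stored page ends up garbled.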
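How do we fix garbled text in Java code?
For example: String s = "測試字符串";
System.out.println(new String(s.getBytes(), "UTF-8")); // Wrong: getBytes() uses the platform default encoding (GBK in this scenario) while decoding uses UTF-8, so the output is garbled.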
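Here getBytes() converts the Unicode string to a byte array in the platform's default encoding, that is, the GBK bytes of "測試字符串" on a GBK system. (On JDK 18 and later the default charset is UTF-8 unless configured otherwise, so the result depends on the runtime.) The charset argument of new String(bytes, charset) specifies how the bytes are to be read; here it is UTF-8, so the bytes are interpreted as UTF-8.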
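Both of the following give the correct result, because the encoding that produced the bytes matches the encoding used to decode them (the first one assumes the platform default is GBK):
System.out.println(new String(s.getBytes(), "GBK"));
System.out.println(new String(s.getBytes("UTF-8"), "UTF-8"));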
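A way to stop depending on the platform default is to name the charset on both sides. The sketch below is one possible shape, with an invented helper name; it uses the StandardCharsets constants, which also avoids the checked UnsupportedEncodingException thrown by the string-named overloads.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ExplicitCharsetRoundTrip {
    // Encode and decode with the same, explicitly named charset.
    static String roundTrip(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String s = "\u6d4b\u8bd5"; // two Chinese characters

        // What the no-argument getBytes() would use on this machine.
        System.out.println("default charset: " + Charset.defaultCharset());

        // Independent of the default: the round trip returns the input.
        System.out.println(roundTrip(s, StandardCharsets.UTF_8).equals(s)); // prints "true"
    }
}
```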
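So how can getBytes and new String() be used to convert between encodings?
A wrong method circulates widely online:
GBK --> UTF-8: new String(s.getBytes("GBK"), "UTF-8");
This is simply wrong: the bytes produced by getBytes are GBK, not UTF-8, so decoding them as UTF-8 yields garbled text.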
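The failure is easy to reproduce, and it is worth seeing that the damage cannot be reversed afterwards. This is a small sketch with made-up names, assuming the GBK charset is available in the runtime.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class WrongConversion {
    // Assumption: the GBK charset is available in this runtime.
    static final Charset GBK = Charset.forName("GBK");

    // The idiom criticized in the text: encode as GBK, decode as UTF-8.
    static String wrongGbkToUtf8(String s) {
        return new String(s.getBytes(GBK), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u6d4b\u8bd5"; // two Chinese characters
        String garbled = wrongGbkToUtf8(original);
        System.out.println(garbled.equals(original)); // prints "false"

        // Trying to undo it fails too: the invalid bytes were already
        // replaced during decoding, so the original bytes are lost.
        String recovered = new String(garbled.getBytes(StandardCharsets.UTF_8), GBK);
        System.out.println(recovered.equals(original)); // prints "false"
    }
}
```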
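But then why does new String(s.getBytes("iso-8859-1"), "GBK") work under Tomcat?
The answer:
Tomcat by default (most notably in older versions) decodes request data as ISO-8859-1. So if the client sent the text as GBK bytes, Tomcat has interpreted those GBK bytes as ISO-8859-1, and reading Chinese text that way obviously produces garbage. We therefore need to get from ISO-8859-1 back to GBK. ISO-8859-1 is a single-byte encoding: it treats every byte as one character, and every byte value maps to a character. Decoding with it therefore loses nothing, and s.getBytes("iso-8859-1") returns exactly the original GBK bytes. new String(s.getBytes("iso-8859-1"), "GBK") can then decode them correctly. In other words, this works only because of a special property of ISO-8859-1; it is a lucky coincidence, not a general conversion technique.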
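The byte-preserving behavior of ISO-8859-1 can be checked directly. The sketch below only simulates what a container does when it decodes request bytes as ISO-8859-1; the class and method names are invented for illustration, no Tomcat API is involved, and the GBK charset is assumed to be available.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Latin1RoundTrip {
    // Assumption: the GBK charset is available in this runtime.
    static final Charset GBK = Charset.forName("GBK");

    // Simulates a container decoding raw request bytes as ISO-8859-1.
    static String decodedByContainer(byte[] rawBytes) {
        return new String(rawBytes, StandardCharsets.ISO_8859_1);
    }

    // The recovery idiom from the text: get the raw bytes back, then
    // decode them with the charset the client actually used.
    static String recover(String misdecoded, Charset actual) {
        return new String(misdecoded.getBytes(StandardCharsets.ISO_8859_1), actual);
    }

    public static void main(String[] args) {
        String text = "\u6d4b\u8bd5"; // two Chinese characters
        byte[] sent = text.getBytes(GBK);

        String misdecoded = decodedByContainer(sent);
        // One char per byte, and the bytes survive unchanged.
        System.out.println(Arrays.equals(misdecoded.getBytes(StandardCharsets.ISO_8859_1), sent)); // prints "true"
        System.out.println(recover(misdecoded, GBK).equals(text)); // prints "true"
    }
}
```

The same trick does not work with a charset such as UTF-8 in the middle, because its decoder replaces invalid sequences instead of preserving them.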
How do we correctly convert GBK to UTF-8? (In reality this is Unicode to UTF-8.)
// Use getBytes to turn the Unicode string into a UTF-8 byte array, then decode that byte array as UTF-8 into a new string
new String(s.getBytes("utf-8"), "utf-8");
Converting UTF-8 to GBK works on the same principle:
new String(s.getBytes("GBK"), "GBK");
In fact, all the real work is done by getBytes(charset). The JDK describes getBytes as follows: Encodes this String into a sequence of bytes using the named charset, storing the result into a new byte array.
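Since the conversion really happens at the byte level, a transcoding step can be written as "decode with the source charset, encode with the target charset". The following is a minimal sketch of that idea; transcode is a hypothetical helper name, and the GBK charset is assumed to be available.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Transcode {
    // Decode the input with its real charset, then encode the resulting
    // Unicode string with the target charset.
    static byte[] transcode(byte[] input, Charset from, Charset to) {
        return new String(input, from).getBytes(to);
    }

    public static void main(String[] args) {
        // Assumption: the GBK charset is available in this runtime.
        Charset gbk = Charset.forName("GBK");
        String text = "\u6d4b\u8bd5"; // two Chinese characters

        byte[] gbkBytes = text.getBytes(gbk);
        byte[] utf8Bytes = transcode(gbkBytes, gbk, StandardCharsets.UTF_8);

        System.out.println(Arrays.equals(utf8Bytes, text.getBytes(StandardCharsets.UTF_8))); // prints "true"
    }
}
```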
OutputStreamWriter w1 = new OutputStreamWriter(new FileOutputStream("D:\\file1.txt"), "UTF-8");
new InputStreamReader(stream, charset)
These two classes make it easy to write and read files in a specified encoding.
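A complete write-then-read cycle with an explicit charset might look like the sketch below. It is an illustration rather than the article's own code: the class and method names are invented, and it uses a temporary file instead of the D:\\file1.txt path so that it runs on any system.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class EncodedFileIo {
    // Writes text to the file using the given charset.
    static void write(Path file, String text, Charset charset) throws IOException {
        try (Writer w = new OutputStreamWriter(new FileOutputStream(file.toFile()), charset)) {
            w.write(text);
        }
    }

    // Reads the whole file, decoding its bytes with the given charset.
    static String read(Path file, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader r = new InputStreamReader(new FileInputStream(file.toFile()), charset)) {
            char[] buf = new char[1024];
            int n;
            while ((n = r.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    // Writes to a temporary file and reads it back with the same charset.
    static String roundTripThroughTempFile(String text, Charset charset) {
        try {
            Path tmp = Files.createTempFile("encoding-demo", ".txt");
            try {
                write(tmp, text, charset);
                return read(tmp, charset);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String s = "\u6d4b\u8bd5"; // two Chinese characters
        System.out.println(roundTripThroughTempFile(s, StandardCharsets.UTF_8).equals(s)); // prints "true"
    }
}
```

The text comes back intact because the writer and the reader name the same charset; if the two differ, the file turns into the same kind of garbled text described earlier.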