国产av日韩一区二区三区精品,成人性爱视频在线观看,国产,欧美,日韩,一区,www.成色av久久成人,2222eeee成人天堂

Home Java JavaBase Java implements obtaining the character encoding of a text file

Java implements obtaining the character encoding of a text file

Dec 23, 2019 am 11:49 AM
java Character Encoding accomplish text file Obtain

Java implements obtaining the character encoding of a text file

1. Understanding character encoding:

1. The default encoding of String in Java is UTF-8, which can be obtained using the following statement: Charset.defaultCharset( );

2. Under the Windows operating system, the default encoding of text files is ANSI, which is GBK for Chinese Windows. For example, if we use the Notepad program to create a new text document, its default character encoding is ANSI.

3. Text text documents have four encoding options: ANSI, Unicode (including Unicode Big Endian and Unicode Little Endian), UTF-8, UTF-16

4, so we read txt files may sometimes not know their encoding format, so a program needs to be used to dynamically determine the encoding of the txt file.

ANSI : No format definition, for Chinese operating systems it is GBK or GB2312

UTF-8 : The first three bytes are: 0xE59B9E (UTF-8), 0xEFBBBF (UTF-8 inclusive BOM)

UTF-16: The first two bytes are: 0xFEFF

Unicode: The first two bytes are: 0xFFFE

For example: Unicode documents start with 0xFFFE, use The program just takes out the first few bytes and makes a judgment.

5. Correspondence between Java encoding and Text encoding:

Java implements obtaining the character encoding of a text file

Java reads Text files. If the encoding format does not match, garbled characters will appear. Therefore, you need to set the correct character encoding when reading text files. The encoding format of Text documents is written in the file header. In the program, the encoding format of the file needs to be parsed first. After obtaining the encoding format, reading the file in this format will avoid garbled characters.

Free online video tutorial recommendation: java learning

2. For example:

There is a text file: test.txt

Java implements obtaining the character encoding of a text file

Test code:

/**
 * 文件名:CharsetCodeTest.java
 * 功能描述:文件字符編碼測(cè)試
 */
 
import java.io.*;
 
public class CharsetCodeTest {
    public static void main(String[] args) throws Exception {
        String filePath = "test.txt";
        String content = readTxt(filePath);
        System.out.println(content);
    }
 
 
public static String readTxt(String path) {
        StringBuilder content = new StringBuilder("");
        try {
            String fileCharsetName = getFileCharsetName(path);
            System.out.println("文件的編碼格式為:"+fileCharsetName);
 
            InputStream is = new FileInputStream(path);
            InputStreamReader isr = new InputStreamReader(is, fileCharsetName);
            BufferedReader br = new BufferedReader(isr);
 
            String str = "";
            boolean isFirst = true;
            while (null != (str = br.readLine())) {
                if (!isFirst)
                    content.append(System.lineSeparator());
                    //System.getProperty("line.separator");
                else
                    isFirst = false;
                content.append(str);
            }
            br.close();
        } catch (Exception e) {
            e.printStackTrace();
            System.err.println("讀取文件:" + path + "失敗!");
        }
        return content.toString();
    }
 
 
    public static String getFileCharsetName(String fileName) throws IOException {
        InputStream inputStream = new FileInputStream(fileName);
        byte[] head = new byte[3];
        inputStream.read(head);
 
        String charsetName = "GBK";//或GB2312,即ANSI
        if (head[0] == -1 && head[1] == -2 ) //0xFFFE
            charsetName = "UTF-16";
        else if (head[0] == -2 && head[1] == -1 ) //0xFEFF
            charsetName = "Unicode";//包含兩種編碼格式:UCS2-Big-Endian和UCS2-Little-Endian
        else if(head[0]==-27 && head[1]==-101 && head[2] ==-98)
            charsetName = "UTF-8"; //UTF-8(不含BOM)
        else if(head[0]==-17 && head[1]==-69 && head[2] ==-65)
            charsetName = "UTF-8"; //UTF-8-BOM
 
        inputStream.close();
 
        //System.out.println(code);
        return charsetName;
    }
}

Running results:

Java implements obtaining the character encoding of a text file

Recommended related articles and tutorials: Getting started with java

The above is the detailed content of Java implements obtaining the character encoding of a text file. For more information, please follow other related articles on the PHP Chinese website!

Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undress AI Tool

Undress AI Tool

Undress images for free

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Clothoff.io

Clothoff.io

AI clothes remover

Video Face Swap

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Hot Topics

PHP Tutorial
1502
276
How to handle transactions in Java with JDBC? How to handle transactions in Java with JDBC? Aug 02, 2025 pm 12:29 PM

To correctly handle JDBC transactions, you must first turn off the automatic commit mode, then perform multiple operations, and finally commit or rollback according to the results; 1. Call conn.setAutoCommit(false) to start the transaction; 2. Execute multiple SQL operations, such as INSERT and UPDATE; 3. Call conn.commit() if all operations are successful, and call conn.rollback() if an exception occurs to ensure data consistency; at the same time, try-with-resources should be used to manage resources, properly handle exceptions and close connections to avoid connection leakage; in addition, it is recommended to use connection pools and set save points to achieve partial rollback, and keep transactions as short as possible to improve performance.

How to work with Calendar in Java? How to work with Calendar in Java? Aug 02, 2025 am 02:38 AM

Use classes in the java.time package to replace the old Date and Calendar classes; 2. Get the current date and time through LocalDate, LocalDateTime and LocalTime; 3. Create a specific date and time using the of() method; 4. Use the plus/minus method to immutably increase and decrease the time; 5. Use ZonedDateTime and ZoneId to process the time zone; 6. Format and parse date strings through DateTimeFormatter; 7. Use Instant to be compatible with the old date types when necessary; date processing in modern Java should give priority to using java.timeAPI, which provides clear, immutable and linear

Comparing Java Frameworks: Spring Boot vs Quarkus vs Micronaut Comparing Java Frameworks: Spring Boot vs Quarkus vs Micronaut Aug 04, 2025 pm 12:48 PM

Pre-formanceTartuptimeMoryusage, Quarkusandmicronautleadduetocompile-Timeprocessingandgraalvsupport, Withquarkusoftenperforminglightbetterine ServerLess scenarios.2.Thyvelopecosyste,

Understanding Network Ports and Firewalls Understanding Network Ports and Firewalls Aug 01, 2025 am 06:40 AM

Networkportsandfirewallsworktogethertoenablecommunicationwhileensuringsecurity.1.Networkportsarevirtualendpointsnumbered0–65535,withwell-knownportslike80(HTTP),443(HTTPS),22(SSH),and25(SMTP)identifyingspecificservices.2.PortsoperateoverTCP(reliable,c

How does garbage collection work in Java? How does garbage collection work in Java? Aug 02, 2025 pm 01:55 PM

Java's garbage collection (GC) is a mechanism that automatically manages memory, which reduces the risk of memory leakage by reclaiming unreachable objects. 1.GC judges the accessibility of the object from the root object (such as stack variables, active threads, static fields, etc.), and unreachable objects are marked as garbage. 2. Based on the mark-clearing algorithm, mark all reachable objects and clear unmarked objects. 3. Adopt a generational collection strategy: the new generation (Eden, S0, S1) frequently executes MinorGC; the elderly performs less but takes longer to perform MajorGC; Metaspace stores class metadata. 4. JVM provides a variety of GC devices: SerialGC is suitable for small applications; ParallelGC improves throughput; CMS reduces

Comparing Java Build Tools: Maven vs. Gradle Comparing Java Build Tools: Maven vs. Gradle Aug 03, 2025 pm 01:36 PM

Gradleisthebetterchoiceformostnewprojectsduetoitssuperiorflexibility,performance,andmoderntoolingsupport.1.Gradle’sGroovy/KotlinDSLismoreconciseandexpressivethanMaven’sverboseXML.2.GradleoutperformsMaveninbuildspeedwithincrementalcompilation,buildcac

go by example defer statement explained go by example defer statement explained Aug 02, 2025 am 06:26 AM

defer is used to perform specified operations before the function returns, such as cleaning resources; parameters are evaluated immediately when defer, and the functions are executed in the order of last-in-first-out (LIFO); 1. Multiple defers are executed in reverse order of declarations; 2. Commonly used for secure cleaning such as file closing; 3. The named return value can be modified; 4. It will be executed even if panic occurs, suitable for recovery; 5. Avoid abuse of defer in loops to prevent resource leakage; correct use can improve code security and readability.

Using HTML `input` Types for User Data Using HTML `input` Types for User Data Aug 03, 2025 am 11:07 AM

Choosing the right HTMLinput type can improve data accuracy, enhance user experience, and improve usability. 1. Select the corresponding input types according to the data type, such as text, email, tel, number and date, which can automatically checksum and adapt to the keyboard; 2. Use HTML5 to add new types such as url, color, range and search, which can provide a more intuitive interaction method; 3. Use placeholder and required attributes to improve the efficiency and accuracy of form filling, but it should be noted that placeholder cannot replace label.

See all articles