国产av日韩一区二区三区精品,成人性爱视频在线观看,国产,欧美,日韩,一区,www.成色av久久成人,2222eeee成人天堂

Home Database Mysql Tutorial 用于大數(shù)據(jù)的并查集(基于HBase)的java類

用于大數(shù)據(jù)的并查集(基于HBase)的java類

Jun 07, 2016 pm 04:28 PM
hbase java based on recommend data system

在做推薦系統(tǒng)的時候想查看原始數(shù)據(jù)集中自然存在的類別有多少種,即找到一些子集,這些子集屬于原始數(shù)據(jù)集,子集之間沒有任何關(guān)聯(lián),而子集內(nèi)部所有數(shù)據(jù)都有直接或間接的關(guān)聯(lián)。 首先考慮的是由于數(shù)據(jù)規(guī)模,讀入內(nèi)存是不可能的,所以要借助硬盤(雖然很不情愿)

在做推薦系統(tǒng)的時候想查看原始數(shù)據(jù)集中自然存在的類別有多少種,即找到一些子集,這些子集屬于原始數(shù)據(jù)集,子集之間沒有任何關(guān)聯(lián),而子集內(nèi)部所有數(shù)據(jù)都有直接或間接的關(guān)聯(lián)。

首先考慮的是由于數(shù)據(jù)規(guī)模,讀入內(nèi)存是不可能的,所以要借助硬盤(雖然很不情愿)。既然是借助硬盤,那就要文件存取。而又由于在處理過程中需要快速的查找數(shù)據(jù)是否存在于某個集合內(nèi)和將數(shù)據(jù)集合關(guān)聯(lián)等操作,選擇使用并查集。

這樣選擇之后算是有一個解決方案了,但是還需要最后一個關(guān)鍵的部分,就是需要建立文件索引和緩存機制以便快速進行合并和查詢過程。這里選擇使用的工具還是最趁手的hbase,很好的解決這兩個問題。

這個類主要解決的問題就是原始數(shù)據(jù)的聚類,有關(guān)聯(lián)的聚在一起。核心的兩個方法是:

public byte[] findSet(byte[] pos);
public void union(byte[] pos1, byte[] pos2);

其中還有一個

public byte[] findSet(byte[] pos)

是遞歸實現(xiàn)。兩個方法都使用了路徑壓縮進行優(yōu)化。union()方法的兩個參數(shù)有順序要求,其作用是后者集合連接到前者集合的根節(jié)點。

最后,計算的并行是使用MapReduce計算框架。

package recommendsystem;
?
import java.io.IOException;
import java.lang.reflect.Array;
?
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;
?
public class UnionFindSet {
	private Configuration _conf;
	private HBaseAdmin _hbAdmin;
	private HTable _unionTable;
?
	public static void main(String[] args) throws IOException {
		UnionFindSet ufs = new UnionFindSet("test");
		ufs.union(Bytes.toBytes("7"), Bytes.toBytes("8"));
		ufs.union(Bytes.toBytes("5"), Bytes.toBytes("9"));
		ufs.union(Bytes.toBytes("3"), Bytes.toBytes("7"));
		ufs.union(Bytes.toBytes("4"), Bytes.toBytes("6"));
		ufs.union(Bytes.toBytes("1"), Bytes.toBytes("7"));
		for (int i = 1; i 
    <p class="copyright">
        原文地址:用于大數(shù)據(jù)的并查集(基于HBase)的java類, 感謝原作者分享。
    </p>
    
    


Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undress AI Tool

Undress AI Tool

Undress images for free

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Clothoff.io

Clothoff.io

AI clothes remover

Video Face Swap

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

How to iterate over a Map in Java? How to iterate over a Map in Java? Jul 13, 2025 am 02:54 AM

There are three common methods to traverse Map in Java: 1. Use entrySet to obtain keys and values at the same time, which is suitable for most scenarios; 2. Use keySet or values to traverse keys or values respectively; 3. Use Java8's forEach to simplify the code structure. entrySet returns a Set set containing all key-value pairs, and each loop gets the Map.Entry object, suitable for frequent access to keys and values; if only keys or values are required, you can call keySet() or values() respectively, or you can get the value through map.get(key) when traversing the keys; Java 8 can use forEach((key,value)-&gt

Java Optional example Java Optional example Jul 12, 2025 am 02:55 AM

Optional can clearly express intentions and reduce code noise for null judgments. 1. Optional.ofNullable is a common way to deal with null objects. For example, when taking values ??from maps, orElse can be used to provide default values, so that the logic is clearer and concise; 2. Use chain calls maps to achieve nested values ??to safely avoid NPE, and automatically terminate if any link is null and return the default value; 3. Filter can be used for conditional filtering, and subsequent operations will continue to be performed only if the conditions are met, otherwise it will jump directly to orElse, which is suitable for lightweight business judgment; 4. It is not recommended to overuse Optional, such as basic types or simple logic, which will increase complexity, and some scenarios will directly return to nu.

Comparable vs Comparator in Java Comparable vs Comparator in Java Jul 13, 2025 am 02:31 AM

In Java, Comparable is used to define default sorting rules internally, and Comparator is used to define multiple sorting logic externally. 1.Comparable is an interface implemented by the class itself. It defines the natural order by rewriting the compareTo() method. It is suitable for classes with fixed and most commonly used sorting methods, such as String or Integer. 2. Comparator is an externally defined functional interface, implemented through the compare() method, suitable for situations where multiple sorting methods are required for the same class, the class source code cannot be modified, or the sorting logic is often changed. The difference between the two is that Comparable can only define a sorting logic and needs to modify the class itself, while Compar

How to fix java.io.NotSerializableException? How to fix java.io.NotSerializableException? Jul 12, 2025 am 03:07 AM

The core workaround for encountering java.io.NotSerializableException is to ensure that all classes that need to be serialized implement the Serializable interface and check the serialization support of nested objects. 1. Add implementsSerializable to the main class; 2. Ensure that the corresponding classes of custom fields in the class also implement Serializable; 3. Use transient to mark fields that do not need to be serialized; 4. Check the non-serialized types in collections or nested objects; 5. Check which class does not implement the interface; 6. Consider replacement design for classes that cannot be modified, such as saving key data or using serializable intermediate structures; 7. Consider modifying

How to handle character encoding issues in Java? How to handle character encoding issues in Java? Jul 13, 2025 am 02:46 AM

To deal with character encoding problems in Java, the key is to clearly specify the encoding used at each step. 1. Always specify encoding when reading and writing text, use InputStreamReader and OutputStreamWriter and pass in an explicit character set to avoid relying on system default encoding. 2. Make sure both ends are consistent when processing strings on the network boundary, set the correct Content-Type header and explicitly specify the encoding with the library. 3. Use String.getBytes() and newString(byte[]) with caution, and always manually specify StandardCharsets.UTF_8 to avoid data corruption caused by platform differences. In short, by

Java method references explained Java method references explained Jul 12, 2025 am 02:59 AM

Method reference is a way to simplify the writing of Lambda expressions in Java, making the code more concise. It is not a new syntax, but a shortcut to Lambda expressions introduced by Java 8, suitable for the context of functional interfaces. The core is to use existing methods directly as implementations of functional interfaces. For example, System.out::println is equivalent to s->System.out.println(s). There are four main forms of method reference: 1. Static method reference (ClassName::staticMethodName); 2. Instance method reference (binding to a specific object, instance::methodName); 3.

JavaScript Data Types: Primitive vs Reference JavaScript Data Types: Primitive vs Reference Jul 13, 2025 am 02:43 AM

JavaScript data types are divided into primitive types and reference types. Primitive types include string, number, boolean, null, undefined, and symbol. The values are immutable and copies are copied when assigning values, so they do not affect each other; reference types such as objects, arrays and functions store memory addresses, and variables pointing to the same object will affect each other. Typeof and instanceof can be used to determine types, but pay attention to the historical issues of typeofnull. Understanding these two types of differences can help write more stable and reliable code.

How to parse JSON in Java? How to parse JSON in Java? Jul 11, 2025 am 02:18 AM

There are three common ways to parse JSON in Java: use Jackson, Gson, or org.json. 1. Jackson is suitable for most projects, with good performance and comprehensive functions, and supports conversion and annotation mapping between objects and JSON strings; 2. Gson is more suitable for Android projects or lightweight needs, and is simple to use but slightly inferior in handling complex structures and high-performance scenarios; 3.org.json is suitable for simple tasks or small scripts, and is not recommended for large projects because of its lack of flexibility and type safety. The choice should be decided based on actual needs.

See all articles