
Table of Contents
What is embedding?
Why choose PostgreSQL and pgvector?
App Overview
Prerequisites
Conclusion

Building a Semantic Search Engine with OpenAI, Go, and PostgreSQL (pgvector)

Jan 15, 2025 am 11:09 AM

In recent years, vector embeddings have become the foundation of modern natural language processing (NLP) and semantic search. Instead of matching keywords, vector databases compare the "meaning" of text through numerical representations (embeddings). This example demonstrates how to create a semantic search engine using OpenAI embeddings, Go, and PostgreSQL with the pgvector extension.

What is embedding?

An embedding is a vector representation of text (or other data) in a high-dimensional space. If two pieces of text are semantically similar, their vectors will be close to each other in this space. By storing embeddings in a database like PostgreSQL (with the pgvector extension), we can run similarity searches directly in SQL, using a vector index to keep them fast.
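To make "close to each other" concrete, here is a minimal sketch of cosine similarity, one common way to compare two embeddings. The three-dimensional vectors and their labels are invented for illustration; real OpenAI embeddings have many more dimensions (the table later in this article uses 1536), and the helper name cosineSimilarity is hypothetical rather than part of any library.

```go
package main

import (
	"fmt"
	"math"
)

// cosineSimilarity returns the cosine of the angle between a and b.
// It returns 0 if the lengths differ, the vectors are empty,
// or either vector has zero magnitude.
func cosineSimilarity(a, b []float32) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, normA, normB float64
	for i := range a {
		x, y := float64(a[i]), float64(b[i])
		dot += x * y
		normA += x * x
		normB += y * y
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

func main() {
	// Toy vectors, made up for this example only.
	cat := []float32{0.9, 0.1, 0.0}
	kitten := []float32{0.8, 0.2, 0.1}
	invoice := []float32{0.0, 0.1, 0.9}

	fmt.Printf("cat vs kitten:  %.3f\n", cosineSimilarity(cat, kitten))
	fmt.Printf("cat vs invoice: %.3f\n", cosineSimilarity(cat, invoice))
}
```

A value near 1 means the vectors point in almost the same direction; a value near 0 means they are unrelated. In this toy data the first pair should score much higher than the second.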

Why choose PostgreSQL and pgvector?

pgvector is a popular extension that adds vector data types to PostgreSQL. It enables you to:

  • Store embeddings as vector columns
  • Perform an approximate or exact nearest neighbor search
  • Run queries using standard SQL
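As a rough illustration of what an exact nearest neighbor search computes, the sketch below ranks a handful of in-memory vectors by Euclidean (L2) distance to a query, which is the distance pgvector's <-> operator uses. The doc type and the nearest and l2Distance helpers are hypothetical names for this example, and the code assumes every vector has the same length as the query. An approximate index such as IVFFlat avoids comparing against every row, trading some recall for speed.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// doc pairs a piece of text with its embedding (hypothetical type).
type doc struct {
	Content   string
	Embedding []float32
}

// l2Distance returns the Euclidean distance between a and b.
// It assumes len(a) == len(b).
func l2Distance(a, b []float32) float64 {
	var sum float64
	for i := range a {
		d := float64(a[i]) - float64(b[i])
		sum += d * d
	}
	return math.Sqrt(sum)
}

// nearest returns the contents of the k documents closest to query,
// ordered from nearest to farthest. It scans every document,
// so the result is exact rather than approximate.
func nearest(docs []doc, query []float32, k int) []string {
	sorted := make([]doc, len(docs))
	copy(sorted, docs)
	sort.SliceStable(sorted, func(i, j int) bool {
		return l2Distance(sorted[i].Embedding, query) < l2Distance(sorted[j].Embedding, query)
	})
	if k > len(sorted) {
		k = len(sorted)
	}
	if k < 0 {
		k = 0
	}
	out := make([]string, 0, k)
	for _, d := range sorted[:k] {
		out = append(out, d.Content)
	}
	return out
}

func main() {
	// Two-dimensional toy data, invented for illustration.
	docs := []doc{
		{"a", []float32{0, 0}},
		{"b", []float32{1, 1}},
		{"c", []float32{5, 5}},
	}
	fmt.Println(nearest(docs, []float32{0.9, 0.9}, 2))
}
```

This is conceptually what ORDER BY embedding <-> query LIMIT k does without an index; the database version runs next to the data and can use an index instead of a full scan.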

App Overview

  1. Call OpenAI’s embedding API to convert input text into vector embeddings.
  2. Use the pgvector extension to store these embeddings in PostgreSQL.
  3. Query embeddings to find the most semantically similar entries in the database.

Prerequisites

  • Go installed (1.19 recommended).
  • PostgreSQL installed and running (local or hosted).
  • Install the pgvector extension in PostgreSQL. (See pgvector’s GitHub page for installation instructions.)
  • An OpenAI API key with access to the embeddings API.

For local testing, the following Makefile defines tasks that start PostgreSQL with pgvector in Docker and open a psql session:

pgvector:
    @docker run -d \
        --name pgvector \
        -e POSTGRES_USER=admin \
        -e POSTGRES_PASSWORD=admin \
        -e POSTGRES_DB=vectordb \
        -v pgvector_data:/var/lib/postgresql/data \
        -p 5432:5432 \
        pgvector/pgvector:pg17
psql:
    @psql -h localhost -U admin -d vectordb

Make sure pgvector is installed. Then, in your PostgreSQL database:

CREATE EXTENSION IF NOT EXISTS vector;

Full code

package main

import (
    "context"
    "fmt"
    "log"
    "os"
    "strings"

    "github.com/jackc/pgx/v5/pgxpool"
    "github.com/joho/godotenv"
    "github.com/sashabaranov/go-openai"
)

func floats32ToString(floats []float32) string {
    strVals := make([]string, len(floats))
    for i, val := range floats {
        // Format each float as a string
        strVals[i] = fmt.Sprintf("%f", val)
    }

    // Join them with a comma and a space
    joined := strings.Join(strVals, ", ")

    // pgvector expects bracket notation for vector input, e.g. [0.1, 0.2, 0.3]
    return "[" + joined + "]"
}

func main() {
    // Load environment variables
    err := godotenv.Load()
    if err != nil {
        log.Fatal("error loading .env file")
    }

    // Create the connection pool
    dbpool, err := pgxpool.New(context.Background(), os.Getenv("DATABASE_URL"))
    if err != nil {
        fmt.Fprintf(os.Stderr, "unable to create connection pool: %v\n", err)
        os.Exit(1)
    }
    defer dbpool.Close()

    // 1. Make sure the pgvector extension is enabled
    _, err = dbpool.Exec(context.Background(), "CREATE EXTENSION IF NOT EXISTS vector;")
    if err != nil {
        // log.Fatalf exits the program, so no explicit os.Exit is needed
        log.Fatalf("failed to create extension: %v\n", err)
    }

    // 2. Create the table (if it does not exist)
    createTableSQL := `
    CREATE TABLE IF NOT EXISTS documents (
        id SERIAL PRIMARY KEY,
        content TEXT,
        embedding vector(1536)
    );
    `
    _, err = dbpool.Exec(context.Background(), createTableSQL)
    if err != nil {
        log.Fatalf("failed to create table: %v\n", err)
    }

    // 3. Create the index (if it does not exist)
    createIndexSQL := `
    CREATE INDEX IF NOT EXISTS documents_embedding_idx
    ON documents USING ivfflat (embedding vector_l2_ops) WITH (lists = 100);
    `
    _, err = dbpool.Exec(context.Background(), createIndexSQL)
    if err != nil {
        log.Fatalf("failed to create index: %v\n", err)
    }

    // 4. Initialize the OpenAI client
    apiKey := os.Getenv("OPENAI_API_KEY")
    if apiKey == "" {
        log.Fatal("OPENAI_API_KEY is not set")
    }
    openaiClient := openai.NewClient(apiKey)

    // 5. Insert sample documents
    docs := []string{
        "PostgreSQL is an advanced open-source relational database.",
        "OpenAI provides GPT-based models for generating text embeddings.",
        "pgvector allows embeddings to be stored in a Postgres database.",
    }

    for _, doc := range docs {
        err = insertDocument(context.Background(), dbpool, openaiClient, doc)
        if err != nil {
            log.Printf("failed to insert document %q: %v\n", doc, err)
        }
    }

    // 6. Query for similar documents
    queryText := "How do I store embeddings in Postgres?"
    similarDocs, err := searchSimilarDocuments(context.Background(), dbpool, openaiClient, queryText, 5)
    if err != nil {
        log.Fatalf("search failed: %v\n", err)
    }

    fmt.Println("=== Most similar documents ===")
    for _, doc := range similarDocs {
        fmt.Printf("- %s\n", doc)
    }
}

// insertDocument uses the OpenAI API to generate an embedding for `content` and inserts it into the documents table.
func insertDocument(ctx context.Context, dbpool *pgxpool.Pool, client *openai.Client, content string) error {
    // 1) Get the embedding from OpenAI
    embedResp, err := client.CreateEmbeddings(ctx, openai.EmbeddingRequest{
        Model: openai.AdaEmbeddingV2, // "text-embedding-ada-002"
        Input: []string{content},
    })
    if err != nil {
        return fmt.Errorf("CreateEmbeddings API call failed: %w", err)
    }

    // 2) Convert the embedding to pgvector's bracketed string format
    embedding := embedResp.Data[0].Embedding // []float32
    embeddingStr := floats32ToString(embedding)

    // 3) Insert into PostgreSQL using positional parameters
    insertSQL := `
        INSERT INTO documents (content, embedding)
        VALUES ($1, $2::vector)
    `
    _, err = dbpool.Exec(ctx, insertSQL, content, embeddingStr)
    if err != nil {
        return fmt.Errorf("failed to insert document: %w", err)
    }

    return nil
}

// searchSimilarDocuments gets the embedding for the user's query and returns the top k most similar documents by vector distance.
func searchSimilarDocuments(ctx context.Context, pool *pgxpool.Pool, client *openai.Client, query string, k int) ([]string, error) {
    // 1) Get the embedding for the user's query from OpenAI
    embedResp, err := client.CreateEmbeddings(ctx, openai.EmbeddingRequest{
        Model: openai.AdaEmbeddingV2, // "text-embedding-ada-002"
        Input: []string{query},
    })
    if err != nil {
        return nil, fmt.Errorf("CreateEmbeddings API call failed: %w", err)
    }

    // 2) Convert the OpenAI embedding to pgvector's bracketed string format
    queryEmbedding := embedResp.Data[0].Embedding // []float32
    queryEmbeddingStr := floats32ToString(queryEmbedding)
    // e.g. "[0.123456, 0.789012, ...]"

    // 3) Build a SELECT statement ordered by vector distance (<-> is L2 distance)
    selectSQL := fmt.Sprintf(`
        SELECT content
        FROM documents
        ORDER BY embedding <-> '%s'::vector
        LIMIT %d;
    `, queryEmbeddingStr, k)

    // 4) Run the query
    rows, err := pool.Query(ctx, selectSQL)
    if err != nil {
        return nil, fmt.Errorf("failed to query documents: %w", err)
    }
    defer rows.Close()

    // 5) Read the matching documents
    var contents []string
    for rows.Next() {
        var content string
        if err := rows.Scan(&content); err != nil {
            return nil, fmt.Errorf("failed to scan row: %w", err)
        }
        contents = append(contents, content)
    }
    if err = rows.Err(); err != nil {
        return nil, fmt.Errorf("row iteration error: %w", err)
    }

    return contents, nil
}
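One detail in the full code is worth a second look: floats32ToString formats each value with %f, which always keeps six digits after the decimal point, so small embedding components lose precision before they reach the database. The following is a possible alternative, not part of the original program: a hypothetical formatVector helper that uses strconv.AppendFloat to emit the shortest decimal string that round-trips to the same float32. It assumes pgvector accepts the same bracket notation without spaces after the commas.

```go
package main

import (
	"fmt"
	"strconv"
)

// formatVector renders v in bracket notation, e.g. [0.1,0.2,0.3].
// Each value is written with the fewest digits needed to parse back
// to the same float32, without exponent notation.
func formatVector(v []float32) string {
	buf := make([]byte, 0, len(v)*12+2)
	buf = append(buf, '[')
	for i, f := range v {
		if i > 0 {
			buf = append(buf, ',')
		}
		buf = strconv.AppendFloat(buf, float64(f), 'f', -1, 32)
	}
	buf = append(buf, ']')
	return string(buf)
}

func main() {
	// Sample values invented for illustration.
	v := []float32{0.1, -0.00001234, 3}

	fmt.Println("shortest round-trip:", formatVector(v))
	fmt.Printf("with %%f:             %f\n", v[1])
}
```

If this were swapped in for floats32ToString, the rest of the program could stay unchanged, since both helpers return a string that is cast with ::vector in SQL.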

Conclusion

OpenAI embeddings, Go, and PostgreSQL with pgvector provide a straightforward way to build semantic search applications. By representing text as vectors and using database indexes to search them, we move from traditional keyword-based search to searching by context and meaning.
