Want to scale your database? Learn about I/O first
Jun 07, 2016, 03:48 PM
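This article is adapted from a blog post on HighScalability. The author is an engineer at Tokutek, a company whose main research focus is storage engines and whose products are said to improve the performance of MongoDB, MySQL, and MariaDB by 20x. CSDN compiled and edited it as follows:
As software developers, we place great value on abstraction. The simpler an API is, the more it appeals to us. Arguably, MongoDB's greatest strengths are its "elegant" API and its agility, which make coding remarkably simple for developers.
However, when MongoDB runs into big data scalability problems, developers still need to understand what happens underneath and identify the underlying issues before they can resolve them quickly. Without that understanding, they may end up choosing an inefficient solution and wasting time and money. This article focuses on how to find an efficient solution to big data scalability problems.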
Defining the problem
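First, we need to establish the context of the application. This article is about MongoDB applications, which means we are looking at a distributed document store that also supports secondary indexes and sharded clusters. For other NoSQL products such as Riak or Cassandra we might also discuss I/O bottlenecks, but this article concentrates on characteristics specific to MongoDB.
Second, what do these applications do? Is it online transaction processing (OLTP) or online analytical processing (OLAP)? This article discusses OLTP, because OLAP is still a considerable challenge for MongoDB, or something it can hardly handle at all.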
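Third, what is big data? By big data we mean processing and using more data than fits in the RAM of a single machine. In that case some of the data stays in memory on the server, while most of it lives on disk and requires I/O to access. Note, however, that the question is not how large the database is on disk, but how large the frequently accessed data (sometimes called the "working set") is. For example, a disk may hold several years of data while the application routinely touches only the last day's worth.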
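Fourth, what is the limiting factor for OLTP applications? In short, it is I/O. A hard disk drive can perform only a few hundred I/Os per second, whereas RAM can serve millions of accesses per second. This gap is what creates the I/O bottleneck in big data applications.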
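Finally, how should we solve the I/O bottleneck? Analysis, formulas, and direct prescriptions offer many approaches, but a lasting solution requires understanding. Users must look at the I/O characteristics of the application before they can make the best design decisions.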
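To make the fourth point concrete, the sketch below estimates overall throughput when only a fraction of operations miss RAM and have to go to disk. The figures of 200 disk I/Os per second and 5,000,000 RAM accesses per second are illustrative assumptions chosen to match "a few hundred" and "millions"; they are not measurements, and the function name `effective_ops_per_second` is hypothetical.

```python
def effective_ops_per_second(miss_ratio, disk_iops=200.0, ram_ops=5_000_000.0):
    """Estimate operations per second for a single-threaded access stream.

    miss_ratio: fraction of operations that need one disk I/O (0.0 to 1.0).
    disk_iops and ram_ops are assumed, illustrative device speeds.
    """
    if not 0.0 <= miss_ratio <= 1.0:
        raise ValueError("miss_ratio must be between 0 and 1")
    # Average time per operation is the weighted mix of a RAM access
    # and a disk I/O.
    avg_seconds = (1.0 - miss_ratio) / ram_ops + miss_ratio / disk_iops
    return 1.0 / avg_seconds


if __name__ == "__main__":
    for ratio in (0.0, 0.01, 0.1, 1.0):
        print(f"miss ratio {ratio:>4}: {effective_ops_per_second(ratio):,.0f} ops/s")
```

Under these assumed figures, even a 1% miss ratio drops throughput from millions of operations per second to roughly twenty thousand, which is why the size of the working set relative to RAM matters so much.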
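The cost model
To solve the I/O bottleneck, the first step is to know which database operations incur I/O. Whether in MongoDB or in any other kind of database, there are three basic operations: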
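- Point Query: looks up a single document. Given the location of the document (on disk or in memory), it retrieves that document. With big data, the document is likely not in memory, so this operation may cause one I/O.
- Range Query: looks up a run of consecutive documents in an index. It is far more efficient than a point query, because the data being read is packed together on disk and can be brought into memory with very few I/Os. A range query that retrieves 100 documents may incur just one I/O, whereas 100 point queries retrieving 100 documents may need 100 I/Os.
- Write: writes a document to the database. In a database like MongoDB, this incurs I/O. In a database built on a write-optimized data structure, such as TokuMX, it needs very little I/O. Unlike MongoDB, a write-optimized data structure can amortize the I/O across many insertions.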
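The three basic operations above can be sketched as a worst-case I/O count, assuming nothing is cached in memory. The default of 100 documents per I/O comes from the range query example in the text; the `writes_per_io` batching factor for a write-optimized structure is a hypothetical parameter for illustration, not a published TokuMX figure, and all function names are made up for this sketch.

```python
import math


def point_query_ios(n_docs):
    """Worst case: every document is on disk, so each lookup costs one I/O."""
    return n_docs


def range_query_ios(n_docs, docs_per_io=100):
    """Consecutive documents are packed together, so one I/O returns many."""
    return math.ceil(n_docs / docs_per_io)


def write_ios(n_writes, writes_per_io=1):
    """writes_per_io=1 models one I/O per write; a larger value models a
    write-optimized structure that amortizes one I/O over many insertions."""
    return math.ceil(n_writes / writes_per_io)


if __name__ == "__main__":
    print("100 point queries:", point_query_ios(100), "I/Os")
    print("1 range query over 100 docs:", range_query_ios(100), "I/O")
    print("1000 writes, unbatched:", write_ios(1000), "I/Os")
    print("1000 writes, 50 per I/O (assumed):", write_ios(1000, 50), "I/Os")
```

The point of the model is the ratio between the counts rather than the absolute numbers, which depend on document size, block size, and cache state.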
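Having seen how the three basic operations affect I/O, we also need to understand how MongoDB's statements affect I/O. MongoDB builds four user-level operations on top of these three basic operations: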
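- Insert: writes a new document to the database.
- Query: uses an index on the collection, performing a combination of range queries and point queries. If the index is a covering index or a clustering index, essentially only a range query is needed. Otherwise, a combination of a range query and point queries is performed.
- Update and delete: a combination of a query and a write. The query finds the documents that need to be updated or deleted, and the write then modifies or removes them.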
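As a rough illustration of how the user-level operations above decompose into the basic ones, the sketch below counts worst-case I/Os with nothing cached. It assumes 100 index entries per I/O (borrowed from the range query example) and one I/O per write by default; these parameters and the function names are hypothetical and only show the shape of the cost, not MongoDB's actual behavior.

```python
import math


def insert_ios(n_docs, writes_per_io=1):
    """An insert is a write."""
    return math.ceil(n_docs / writes_per_io)


def query_ios(n_matches, covering, entries_per_io=100):
    """A query is a range query over the index, plus one point query per
    matching document unless the index is covering or clustering."""
    index_scan = math.ceil(n_matches / entries_per_io)
    document_fetches = 0 if covering else n_matches
    return index_scan + document_fetches


def update_or_delete_ios(n_matches, covering=False, writes_per_io=1,
                         entries_per_io=100):
    """An update or delete is a query to find the documents, then a write."""
    find = query_ios(n_matches, covering, entries_per_io)
    write = math.ceil(n_matches / writes_per_io)
    return find + write


if __name__ == "__main__":
    print("query, covering index:", query_ios(100, covering=True))
    print("query, non-covering index:", query_ios(100, covering=False))
    print("update of 100 documents:", update_or_delete_ios(100))
```

Comparing the covering and non-covering cases shows why the choice of index can change the I/O cost of the same query by about two orders of magnitude in this model.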
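Now we understand the cost model. To solve the I/O bottleneck, though, users also need to know which of the application's operations are incurring I/O, and that requires understanding how the database behaves. Is the I/O caused by queries? If so, how does the query behavior affect I/O? Or is it caused by updates? If updates are responsible, is it the query inside the update or the insert? Once users know which factors drive I/O, they can work on the bottleneck step by step.
Assuming we understand an application's I/O characteristics, we can explore several ways of addressing the problem. My favorite approach is this: first try to solve the problem in software, and only if that does not solve it completely, consider hardware. After all, software is cheaper and easier to maintain.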
