
Finally Redis collections are iterable

Jun 07, 2016 pm 04:35 PM

The Redis API for data access is usually limited, but very direct and straightforward.

It is limited because it only allows accessing data in a natural way, that is, in a way that is obvious given the data structure. Sorted sets are easy to access by score ranges, hashes by field name, and so forth.
This API "way" has profound effects on what Redis is and on how users organize data into it. An API that is data-obvious means fast operations, less code and fewer bugs in the implementation, but especially it forces the application layer to make meaningful choices: the database becomes a system in which you are responsible for organizing data in a way that makes sense for your application, versus a magical object where you put data inside and it will then fetch and organize the data for you in any format.

However most Redis data types, including the outer key-value shell if we want to consider it a data type for a moment (it is a dictionary after all), are collections of elements. The key-value space is a collection of keys mapped to values, just as Sets are collections of unordered elements, and so forth.

In most application logic using Redis, the idea is that you know which member is inside a collection, or at least which member you should test for existence. But life is not always that easy, and sometimes you need something more: to scan the collection in order to retrieve all the elements inside. And yes, since this is a common need, we have commands like SMEMBERS, HGETALL, or even KEYS to retrieve everything there is inside a collection, but those commands are always a last resort, because they are O(N) operations.

Is your collection very small? Fine, use SMEMBERS and you get Redis-like performance anyway. Is your collection big? Don't use O(N) commands except for debugging purposes. A popular example is the misused KEYS command, a source of trouble for non-experts, and a top hit among Redis slow log entries.
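To make the difference concrete, here is a minimal sketch of the cursor-driven loop a client runs instead of a single O(N) call. The `scan` function below is a toy stand-in for the real Redis SCAN command (it serves keys in small batches and signals the end with cursor 0); the loop driver is the part a real client would write.

```python
# Toy stand-in for the SCAN command: serve the keyspace in small batches,
# returning (next_cursor, batch). A cursor of 0 means the iteration is over.
def scan(cursor, keys, count=10):
    batch = keys[cursor:cursor + count]
    next_cursor = cursor + count
    if next_cursor >= len(keys):
        next_cursor = 0          # 0 signals the end of the iteration
    return next_cursor, batch

def scan_all(keys):
    """Drive the loop exactly the way a client drives SCAN: repeat until 0."""
    cursor, seen = None, []
    while cursor != 0:
        cursor, batch = scan(cursor or 0, keys)
        seen.extend(batch)       # each call does a small, bounded amount of work
    return seen

keys = ["key:%d" % i for i in range(33)]
assert sorted(scan_all(keys)) == sorted(keys)
```

Each call is cheap and bounded, so the server never blocks on a huge collection; the O(N) cost is spread across many small requests.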

Black holes
===

The problem is that, because of O(N) operations, Redis collections (excluding Sorted Sets, which can be accessed by rank, in ranges, and in many other ways) tend to be black holes: you put things inside, and you can hardly explore them again.

And there are plenty of reasons to explore what is inside: garbage collection tasks, schema migrations, or even fixing the contents of keys after an application bug corrupted some data.

What we really needed was an iterator. Pieter Noordhuis and I were aware of this problem since the early days of Redis, but it was a major design challenge, because traditionally the deal is: you want a data structure to be iterable? Then it is going to be a sorted, tree-like data structure, with the concept of previous and next element.

Instead, most Redis data types are based on hash tables, and Redis hash tables are of a particular kind at that, as they resize automatically and incrementally, so sometimes you even have two tables at the same time, slowly exchanging elements from one to the other.

Hash tables are cool because they have good memory efficiency and the constant-time access property. What we use are power-of-two sized hash tables, with chaining for collision handling, and this has worked pretty well so far. An indirect indicator is that sorted sets, the only data structure we have based on a tree-like structure (skip lists), are measurably slower than the others once elements start to pile up: while log(N) is small, with millions of elements it becomes a big enough number that the cache misses summed together make a difference.

There was no easy way to say goodbye to hash tables.

However, eventually Pieter applied one of the most powerful weapons a designer has in their hands: sacrifice. He sent me a pull request for a new SCAN command.

The command is able to walk the key space incrementally, using a cursor that is returned to the user after every call to SCAN, so it is a completely stateless affair. It works something like this:

redis 127.0.0.1:6379> flushall
OK
redis 127.0.0.1:6379> debug populate 33
OK
redis 127.0.0.1:6379> scan 0
1) "52"
2)  1) "key:29"
    2) "key:13"
    3) "key:9"
    4) "key:12"
    5) "key:28"
    6) "key:30"
    7) "key:26"
    8) "key:14"
    9) "key:21"
   10) "key:20"
redis 127.0.0.1:6379> scan 52
1) "9"
2)  1) "key:16"
    2) "key:31"
    3) "key:3"
    4) "key:0"
    5) "key:32"
    6) "key:17"
    7) "key:24"
    8) "key:8"
    9) "key:15"
   10) "key:11"

… and so forth until the returned cursor is 0 again …

This is possible because SCAN does not make big promises, hence the sacrifice: it only guarantees to return, at least one time, all the elements that are in the collection from the start to the end of the iteration.

This means, for example, that:

1) Elements may be returned multiple times.
2) Elements added during the iteration may be returned, or not, at random.

It turns out that this is a perfectly valid compromise, and that at the application level you can almost always do what is needed to play well with these properties. Sometimes the operation you are performing on every element is simply safe to re-apply; other times you can put a flag in your (for example) Hash in order to mark it as processed, or a timestamp perhaps, and so forth.
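The "mark it as processed" idea above can be sketched as follows. This is a minimal, hypothetical illustration, not Redis code: the `processed` set stands in for whatever marker you keep (a flag in a Hash, a timestamp field), and the input list plays the role of a SCAN stream that may deliver the same element twice.

```python
# Sketch: tolerating SCAN's "at least once" delivery at the application level.
# 'stream' simulates the elements yielded across SCAN calls (duplicates allowed);
# 'processed' simulates a per-element "already handled" marker.
def handle_stream(stream):
    processed = set()
    for element in stream:
        if element in processed:
            continue             # re-delivered element: safe to skip
        # ... do the real (possibly non-idempotent) work here ...
        processed.add(element)
    return processed

# SCAN may return "key:1" twice; the guard makes the net effect identical.
assert handle_stream(["key:1", "key:2", "key:1"]) == {"key:1", "key:2"}
```

If the per-element operation is already idempotent (e.g. setting a field to a fixed value), the guard is unnecessary and you can simply re-apply it.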

Eventually merged
===

Pieter did an excellent job, but the pull request remained pending forever (more than one year), because it relied on an implementation that is complex to understand. Basically, in order to guarantee the above properties with tables that can change from one SCAN call to the other, Pieter used a cursor that is incremented by inverting the bits, in order to count starting from the most significant bits first. This is why in the example you see the returned cursor jumping forward and backward.

This has different advantages, including the fact that it returns few duplicates when possible (by checking fewer slots than a more naive implementation would). Eventually I studied his code and tried, without success, to find a simpler algorithm with the same properties, so what I did was document the algorithm. You can read the description in the
dict.c file: https://github.com/antirez/redis/blob/unstable/src/dict.c#L663

After some whiteboard reasoning it is not too hard to see how it works, but it is not trivial either. However, it definitely works very well.
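The core trick, for the simple case of a table whose size never changes, can be sketched in a few lines: instead of adding one to the low bits of the cursor, you reverse its bits, add one, and reverse back, so the increment is carried on the most significant bits first. This is a simplified illustration of the reverse-binary-iteration idea documented in dict.c, not the full algorithm (which also handles tables that grow and shrink between calls).

```python
def rev_bits(v, nbits):
    """Reverse the lowest nbits bits of v."""
    out = 0
    for _ in range(nbits):
        out = (out << 1) | (v & 1)
        v >>= 1
    return out

def next_cursor(cursor, table_size):
    """Advance the cursor over a fixed power-of-two sized table:
    reverse the bits, add one, reverse back."""
    nbits = table_size.bit_length() - 1
    return rev_bits((rev_bits(cursor, nbits) + 1) & (table_size - 1), nbits)

# Walk an 8-slot table: every slot is visited exactly once before the cursor
# wraps back to 0, and the order jumps around, which is why the cursors
# returned by SCAN look random.
visited, cursor = [], 0
while True:
    visited.append(cursor)
    cursor = next_cursor(cursor, 8)
    if cursor == 0:
        break
assert visited == [0, 4, 2, 6, 1, 5, 3, 7]   # each slot exactly once
```

Because the high bits advance first, a slot already visited at a small table size maps onto slots already visited at a larger size too, which is what lets the real algorithm survive rehashing without missing elements that were present the whole time.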

So with the code merged into unstable, I tried to generalize the implementation in order to make it work with all the other Redis types that can be iterated this way, namely Hashes, Sets and Sorted Sets, in the form of additional commands named HSCAN, SSCAN and ZSCAN.

You may wonder why have a ZSCAN at all. The reason is that, while sorted sets are iterable in other ways, the specific properties of the SCAN iterator are not trivial to simulate by scanning elements by rank or score. Moreover, sorted sets are internally implemented by both a skip list and a hash table, so we already had the hash table, and extending SCAN to sorted sets was trivial.

Patterns too!
===

SCAN and its variants can be used with the MATCH option in order to return only the elements matching a pattern. I'm also implementing the TYPE option, in order to return only keys of a specific type.

This comes almost for free: since SCAN does not guarantee to return elements at all, what happens is that it scans something like 10 buckets of the hash table per call (by default; you can change this) and later filters the output. Even if the return value contains no elements, you keep iterating as long as the returned cursor is non-zero.

As you can see, this is something you could do client-side as well, by matching the returned elements against a pattern, but it is much faster to do it server-side, given that it is a very common need, and one users are already used to because of the KEYS command. And of course it requires less I/O, if the pattern only matches a small number of elements.
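For comparison, the client-side version of this filtering can be sketched with Python's fnmatch, whose glob-style patterns are close to (though not identical with) the ones KEYS and MATCH accept; the server-side option does the same thing but saves the bandwidth of shipping non-matching elements back.

```python
from fnmatch import fnmatchcase

def filter_batch(batch, pattern):
    """Keep only the elements of one SCAN reply that match the glob pattern.
    fnmatch globs approximate Redis patterns; character-class syntax differs."""
    return [key for key in batch if fnmatchcase(key, pattern)]

batch = ["key:13", "key:2", "key:12", "key:30"]
assert filter_batch(batch, "key:1*") == ["key:13", "key:12"]
```

With MATCH, this filtering happens on the server, so a reply may well be empty while the cursor is still non-zero; the client loop stays exactly the same.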

This is an example:

redis 127.0.0.1:6379> scan 0 MATCH key:1*
1) "52"
2) 1) "key:13"
   2) "key:12"
   3) "key:14"
redis 127.0.0.1:6379> scan 52 MATCH key:1*
1) "9"
2) 1) "key:16"
   2) "key:17"
   3) "key:15"
   4) "key:11"
redis 127.0.0.1:6379> scan 9 MATCH key:1*
1) "59"
2) 1) "key:10"
   2) "key:1"
   3) "key:18"
redis 127.0.0.1:6379> scan 59 MATCH key:1*
1) "0"
2) 1) "key:19"
In the last call the returned cursor is 0, so the iteration has ended.

Excited about it? I've got good news: this is going to be backported to 2.8, since it is completely self-contained code, so if it is broken it does not affect other stuff. And not just that: there are very big companies that have been using SCAN for some time now, so I'm confident it's pretty stable.

Enjoy iterating!

Discuss this blog post on Hacker News: https://news.ycombinator.com/item?id=6633091