Reviewing the Internals of MariaDB’s Vector Index 2

image-1

1. What optimizations does MariaDB apply to vector indexes?

1.1 SIMD Acceleration

1.2 Node Caching to Reduce Storage Engine Access


2. What Is the Transaction Isolation on the Vector Index?

(This section refers specifically to vector indexes on InnoDB.)

1. Read Committed

(tx_isolation = ‘READ COMMITTED’)

2. Repeatable Read

(tx_isolation = ‘REPEATABLE READ’)

Interestingly, under the Repeatable Read isolation level, the correctness of locking reads is not guaranteed by InnoDB’s next-key locks or gap locks. Since gap protection applies to ordered indexes, the concept of a “gap” does not really exist in a vector index structure. However, locking reads in this case effectively behave like Serializable, which still satisfies the requirements of Repeatable Read.

Another notable quirk: the cache layer disrupts normal locking behavior. If a node is found in the cache, no InnoDB-level lock is acquired. Locking only happens on cache misses. This makes the locking behavior somewhat unpredictable under high cache hit rates.


3. Conclusion

Based on the reviews in my last post and this one, I believe MariaDB’s current implementation of vector indexes offers an excellent case study of how to integrate vector search in a relational database. It achieves a strong balance between engineering complexity, performance, and applicability.

Looking forward to seeing even more powerful iterations in the future!