I’m guessing this is a chapter on concurrency and other things
Many things can go wrong in data systems
- Software or hardware may fail at any time
- Application may crash at any time (Including during a series of operations)
- Several clients may write to the database at the same time, clobbering each other's changes
- Race conditions between clients can cause surprising bugs
A Reliable System needs to deal with these faults
For decades, transactions have been the mechanism of choice for simplifying these issues.
A transaction is a way for an application to group several reads and writes together into a logical unit.
- Transaction succeeds (Commit) or fails (Abort and rollback)
- They’re not a law of nature, just a programming paradigm to simplify the programming model for applications accessing a database.
- Programs can ignore certain potential error scenarios and concurrency issues
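A minimal sketch of the commit-or-abort pattern, using Python's built-in sqlite3 module (the accounts table and the transfer amount are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 100)")
conn.commit()

try:
    # Both writes belong to one logical unit: either both take effect or neither.
    conn.execute("UPDATE accounts SET balance = balance - 30 WHERE id = 'alice'")
    conn.execute("UPDATE accounts SET balance = balance + 30 WHERE id = 'bob'")
    conn.commit()      # success: both writes become visible and durable
except sqlite3.Error:
    conn.rollback()    # failure: both writes are discarded
```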
This chapter will examine examples of things that can go wrong, and see algorithms that databases use to guard against the issues.
- Will also go deep into the area of concurrency control
Slippery Concept of a Transaction
Almost all relational databases today, and some nonrelational databases, support transactions. The general idea has remained the same since 1975 (IBM System R, the first SQL database)
Transactions have advantages and limitations
- Benefits
- ACID (I know ACID, so I won't bother writing it out here, but here are some fun facts)
- ACID is pretty loosely defined, especially Isolation
- It's mostly just a marketing term
- Systems that aren’t ACID are called BASE (Basically available, soft state, eventually consistent)
- Even more vague than ACID
- Anything that isn’t ACID
Atomicity
- In multithreaded programming, "atomic" usually means other threads can't see a half-finished operation
- ACID atomicity isn't about concurrency at all; Isolation covers what happens when several processes access the same data at the same time
- It's about what happens if a fault occurs partway through a series of writes: the transaction is aborted and all of its writes are discarded
- So the system can only end up in the all-or-nothing state (abortability would arguably be a better name)
Consistency
- Kleppmann says the word is terribly overloaded
- Consistent hashing / eventual consistency / replica consistency
- CAP theorem uses consistency to mean linearizability
- In ACID, consistency means that there are certain statements about your data (invariants) that must always be true
- Accounting system -> Credits and debits must always balance to zero
- If a transaction starts with a database that is valid according to those invariants, and any writes during the transaction preserve that validity, then the invariants are always satisfied
- The application has to define its transactions (or rely on database constraints) so that weird data violating the invariants can't get in
- So this is not only the job of the database; it's largely the application's responsibility
Isolation
- Most databases are accessed by several clients
- Need to make sure there aren’t any race conditions with modified data
- Isolation means that concurrently executing transactions are isolated from each other
Durability
- Data can be stored without fear of losing it
- Usually means that data has been written to nonvolatile storage like a hard drive or SSD
- Usually also involves a write-ahead log or similar, which allows for recovery in the event that data structures on disk are corrupted
- Perfect durability does not exist (See Reliability Chapter)
- Data can be corrupted
- Power outages can happen
- Disk can gradually wear away
- One study found that between 30% and 80% of SSDs develop at least one bad block during their first four years of operation
Single-Object and Multi-Object Operations
He gave examples of how atomicity / isolation work on individual objects (single rows or documents)
- I’m gonna go ahead and skip this
Many distributed datastores have abandoned multi-object transactions
- There are some cases where multi-object transactions are useful
- Rows with foreign keys: when inserting several records that refer to one another, the references need to remain valid
- In the document data model, the fields that need to be updated together are usually in the same document, which is treated as a single object
- But document databases don't use joins; they rely on denormalization, and the denormalized copies often mean several documents need updating in one go
- Secondary indexes also need updating whenever a value changes
- From a transaction point of view, these indexes are different database objects
Handling Errors and Aborts
A key feature of a transaction is that it can be aborted and safely retried if an error occurs; ACID is based on this philosophy
Not all systems follow this idea
- Leaderless Replication work on a “best effort” basis
- Database will do as much as it can, and if it gets an error, it won’t undo anything
- Application’s responsibility to recover from errors
Kleppmann says errors will happen, but many software developers prefer to only think about the happy path rather than intricacies of error handling
Aborting isn’t perfect
- If the transaction actually succeeded but the network failed while acknowledging the commit, the client will retry and the transaction gets performed twice (unless you have application-level deduplication / idempotency)
- If the error is due to overload, retrying won't help and will make the problem worse (limit retries, use exponential backoff)
- Only retry after transient errors (deadlock, isolation violation, temporary network error, failover); a retry sketch follows this list
- Retrying after a permanent error is pointless (like a constraint violation)
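A rough sketch of that retry logic, assuming the whole transaction is wrapped in a function the caller passes in; the TransientError class is a placeholder for whatever deadlock / serialization-failure / network exceptions a real driver raises:

```python
import random
import time

class TransientError(Exception):
    """Placeholder for deadlock, serialization-failure, or network errors."""

def run_with_retries(run_transaction, max_retries=3):
    # Permanent errors (e.g. constraint violations) are deliberately not
    # caught here: retrying them is pointless.
    for attempt in range(max_retries + 1):
        try:
            return run_transaction()
        except TransientError:
            if attempt == max_retries:
                raise  # give up after a bounded number of attempts
            # Exponential backoff with jitter, so retries don't make an
            # overload situation even worse.
            time.sleep((2 ** attempt) * 0.1 + random.random() * 0.1)
```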
Weak Isolation Levels
Concurrency bugs are hard to find by testing (They only appear when timing is unlucky)
- Very difficult to reproduce
Serializable isolation (the strongest level of transaction isolation) has performance costs
To prevent performance costs, systems will use weaker isolation levels.
- Concurrency bugs have led to big losses of money
- People usually just say "use serializable", but many popular relational databases get by with weak isolation
Don't blindly rely on tools; we need to develop a good understanding of the kinds of concurrency problems that exist and how to prevent them. Then we can build reliable applications using the tools at our disposal
This section will look at several weak isolation levels that are used in practice and discuss in detail what kinds of race conditions can and cannot occur, so that you can decide what level is appropriate for your application.
Then, Kleppmann will discuss serializability in detail
Read Committed
Most basic level of transaction isolation
- Only see data committed (no dirty reads)
- When writing, only overwrite data that has been committed (no dirty writes)
I feel like this is pretty straightforward, right?
- Dirty read is seeing data that has not yet been committed by a transaction
Why prevent these?
- User sees an unread email, but the counter isn’t updated
- Seeing partially updated state is confusing for users
No Dirty Writes
- Two transactions try to update the database at same time
- Don’t want to overwrite a non-committed value
- If transactions updates multiple objects, this can be bad
- We lose data from that transaction
- Alice and Bob try to buy the same car, and let's say the sale requires two DB writes
- The sale could be awarded to Bob while the invoice is sent to Alice
Implementing Read Committed
Very popular Isolation Level
- It's the default setting in PostgreSQL, SQL Server 2012, MemSQL, and many others
Databases prevent dirty writes by using row-level locks: when a transaction wants to modify an object, it must acquire a lock on that object first and hold it until the transaction is committed or aborted.
Preventing dirty reads
- Could use that same lock for reads too, but one long-running write would block all reads of the object
- Most databases instead remember two versions of the data: the old committed value and the new value being written by the transaction that holds the lock
- While the write is in progress, readers are simply shown the old value (a small sketch of this two-version idea follows)
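A toy in-memory sketch of that two-version idea (this is just to illustrate the concept, not how any real database implements it):

```python
import threading

class VersionedObject:
    def __init__(self, value):
        self.committed = value      # value shown to other transactions
        self.uncommitted = None     # value written by the lock holder
        self.write_lock = threading.Lock()

    def read(self):
        # Dirty reads are prevented simply by never exposing the
        # uncommitted value; readers don't need the lock at all.
        return self.committed

    def write(self, new_value):
        # Held until commit/abort, which also prevents dirty writes.
        self.write_lock.acquire()
        self.uncommitted = new_value

    def commit(self):
        self.committed = self.uncommitted
        self.uncommitted = None
        self.write_lock.release()

    def abort(self):
        self.uncommitted = None
        self.write_lock.release()
```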
Snapshot Isolation and Repeatable Read
What concurrency bugs is this level missing?
- Alice could view the database in an inconsistent state
- Alice checks her account balances while someone is transferring money to her at the same time
- For a moment, Alice sees the money already taken out of the other person's account, but the money hasn't shown up in her own account yet
This is called a non-repeatable read or read skew
- Kleppmann says skew is unfortunately an overloaded word
Of course this isn't a lasting problem, just a brief inconsistency. But in other cases it would be horrible
- Backups: copying the database takes a long time, so parts of the backup may contain older data and parts newer; restoring from such a backup makes the inconsistency permanent
- Same for long-running analytics queries and integrity checks, which may return nonsensical results
Snapshot Isolation is the most common solution to this problem.
- Each transaction reads from consistent snapshot of DB
Implementation
Typically use write locks to prevent dirty writes and don't use any locks on reads
- Reads never block writes and writers never block readers
Database keeps several different committed versions of an object
- Called -> Multi-version concurrency control (MVCC)
Two versions per object would be okay for read committed isolation, but snapshot isolation needs to keep several committed versions around
The typical approach is that read committed uses a separate snapshot for each query, while snapshot isolation uses the same snapshot for the entire transaction
In PostgreSQL, each transaction is given an always-increasing ID, and rows are tagged with the IDs of the transactions that created and deleted them
- An update is internally translated into a delete and a create
- So the table contains rows marked as deleted alongside the freshly created versions
- Garbage collection eventually removes the delete-marked rows once no transaction can still see them
- The result is multiple copies of the same object stored as rows in the DB
Visibility rules for observing a consistent snapshot
When a transaction reads from the DB, transaction IDs are used to decide which objects it can see and which are invisible.
By carefully defining visibility rules, the database can present a consistent snapshot of the database to the application.
It works like this (a sketch of these rules follows this list):
- At the start of each transaction, the database makes a list of all the other transactions in progress at that time; any writes they have made are ignored, even if they later commit
- Any writes made by aborted transactions are ignored
- Any writes made by transactions with a later transaction ID (i.e., that started after this one) are ignored
- All other writes are visible to the application's queries
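A sketch of that visibility check; the created_by / deleted_by field names follow the book's description of PostgreSQL-style MVCC rows, while the snapshot structure here is my own simplification:

```python
def is_visible(row, snapshot):
    """row: dict with 'created_by' and 'deleted_by' transaction IDs.
    snapshot: dict with 'txid' (the reading transaction's own ID),
    'in_progress' (IDs in progress when the snapshot was taken),
    and 'aborted' (IDs of aborted transactions)."""
    creator = row["created_by"]
    # Ignore writes by aborted transactions, by transactions still in
    # progress at snapshot time, and by transactions that started later.
    if (creator in snapshot["aborted"]
            or creator in snapshot["in_progress"]
            or creator > snapshot["txid"]):
        return False
    # The row must also not have been deleted by a transaction that is
    # itself visible to us (same rules applied to the deleter).
    deleter = row["deleted_by"]
    if deleter is not None and (deleter not in snapshot["aborted"]
                                and deleter not in snapshot["in_progress"]
                                and deleter <= snapshot["txid"]):
        return False
    return True
```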
Indexes and snapshot Isolation
How do indexes work in a multi-version database? One option is to have the index point to all versions of an object and require index queries to filter out the versions that aren't visible to the current transaction.
In practice, many implementation details determine how this actually performs.
- Another approach: append-only / copy-on-write B-trees, where every update creates a new tree root that is itself a consistent snapshot
Repeatable Read and Naming Confusion
Snapshot isolation is a useful isolation level, especially for read-only transactions. Many databases call it by different names though.
- Oracle calls it serializable
- PostgreSQL and MySQL call it repeatable read
The SQL standard doesn't define true snapshot isolation; its definition of repeatable read just looks superficially similar.
- As a result, databases that call an isolation level "repeatable read" differ a lot in the guarantees they actually provide
Preventing Lost Updates
So far we've mostly ignored what happens when two transactions write concurrently
A lost update can occur in a read-modify-write cycle (the application reads a value, modifies it, and writes the modified value back)
If two transactions do this concurrently, data is lost
- Incrementing a count
- Two users editing a wiki page
Atomic write operations
Many databases provide atomic update operations, which remove the need for read-modify-write cycles in application code
They're usually implemented by taking an exclusive lock on the object when it is read, so no other transaction can read it until the update has been applied
- This technique is sometimes called cursor stability (a sketch contrasting read-modify-write with an atomic update follows)
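A small contrast between a lost-update-prone read-modify-write cycle and an atomic update, using sqlite3 purely for illustration (the counters table mirrors the book's counter example; the page_views key is made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE counters (key TEXT PRIMARY KEY, value INTEGER)")
conn.execute("INSERT INTO counters VALUES ('page_views', 0)")
conn.commit()

# Lost-update prone: two concurrent clients can both read 0 and both write 1.
(value,) = conn.execute(
    "SELECT value FROM counters WHERE key = 'page_views'").fetchone()
conn.execute("UPDATE counters SET value = ? WHERE key = 'page_views'",
             (value + 1,))
conn.commit()

# Atomic update: the increment happens inside the database, so concurrent
# increments can't clobber each other.
conn.execute("UPDATE counters SET value = value + 1 WHERE key = 'page_views'")
conn.commit()
```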
Explicit Locking
If the database's built-in atomic operations can't do what's needed, the application can explicitly lock the objects it is going to update, then perform its own read-modify-write cycle
This comes up in game development, where multiple players can move the same character simultaneously and the rule checks involve application logic, not just the database
Adding these locks works, but it's very easy to forget one somewhere and introduce a race condition (a SELECT ... FOR UPDATE sketch follows)
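A sketch of explicit locking with SELECT ... FOR UPDATE (supported by PostgreSQL, MySQL, and others), mirroring the book's multiplayer game example; the psycopg2 connection string and schema are hypothetical:

```python
import psycopg2

conn = psycopg2.connect("dbname=game")   # hypothetical connection
cur = conn.cursor()

# FOR UPDATE locks the returned rows until the transaction ends, so no other
# transaction can run a concurrent read-modify-write on this figure.
cur.execute(
    "SELECT * FROM figures WHERE name = %s AND game_id = %s FOR UPDATE",
    ("robot", 222))

# ... check the game's move rules in application code, then:
cur.execute(
    "UPDATE figures SET position = %s WHERE name = %s AND game_id = %s",
    ("c4", "robot", 222))
conn.commit()
```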
Automatically detecting lost updates
Alternative is to allow these to occur and have a transaction manager detect a lost update, then abort the transaction and force it to retry.
Advantage is that databases can perform this check efficiently in conjunction with snapshot isolation.
- PostgreSQL's repeatable read, Oracle's serializable, and SQL Server's snapshot isolation levels detect lost updates automatically; MySQL/InnoDB's repeatable read does not
It's a great feature because it doesn't require application code to do anything special
Compare and Set
Databases without transactions sometimes have atomic compare-and-set operation. Purpose of this is to avoid lost updates by allowing an update to happen only if the value has not changed since you last read it.
The database checks whether the value is still the same as when the read-modify-write cycle started; if not, the update has no effect and the cycle must be retried (a sketch follows)
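A compare-and-set sketch based on the book's wiki-page example, again using sqlite3 just so it runs:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wiki_pages (id INTEGER PRIMARY KEY, content TEXT)")
conn.execute("INSERT INTO wiki_pages VALUES (1234, 'old content')")
conn.commit()

# The update only applies if nobody changed the content since we read it.
cur = conn.execute(
    "UPDATE wiki_pages SET content = ? WHERE id = ? AND content = ?",
    ("new content", 1234, "old content"))
conn.commit()

if cur.rowcount == 0:
    # Someone else edited the page in the meantime: re-read and retry the
    # whole read-modify-write cycle. (Caveat from the book: if the database
    # evaluates the WHERE clause against an old snapshot, this check may not
    # actually prevent lost updates.)
    pass
```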
Conflict Resolution and Replication
Preventing lost updates gets harder in replicated databases
Copies of data on multiple nodes and data can potentially be modified concurrently on different nodes
multi-leader and leaderless replication allow multiple writes to happen concurrently
- techniques based on locks or compare and set won’t work
Common approach is to just allow concurrent writes to have conflicts, then use application code or a data structure to resolve and merge these versions after the fact
Atomic operations can work well too, especially commutative ones (e.g., incrementing a counter)
Write Skew and Phantoms
We’re not at the end of potential race conditions yet
Imagine doctors at a hospital, where at least one doctor must be on call at all times
- If the last two on-call doctors go off call at the same time, you can end up with nobody on call
- Under snapshot isolation, the "are there at least two doctors on call?" check returns true for both transactions, so both are allowed to proceed
Characterizing a write skew
This is called write skew
It's neither a dirty write nor a lost update, but it's definitely a race condition
- It can be seen as a generalization of the lost update problem: write skew occurs when two transactions read the same objects and then update some of those objects (possibly different ones)
Atomic single object operations don’t help (multiple objects are involved)
The automatic detection of lost updates doesn't help either. Some databases allow constraints, but most don't support multi-object constraints
If a serializable isolation level isn't an option, the second-best choice is to explicitly lock the rows the transaction depends on (a sketch follows)
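A sketch of that explicit-locking fix for the doctors example, in the spirit of the book's FOR UPDATE approach; the psycopg2 connection and schema are hypothetical:

```python
import psycopg2

conn = psycopg2.connect("dbname=hospital")   # hypothetical connection
cur = conn.cursor()

# Lock all rows the decision depends on, so two "go off call" transactions
# can't both pass the check concurrently.
cur.execute(
    "SELECT * FROM doctors WHERE on_call = true AND shift_id = %s FOR UPDATE",
    (1234,))
currently_on_call = cur.fetchall()

if len(currently_on_call) >= 2:
    cur.execute(
        "UPDATE doctors SET on_call = false WHERE name = %s AND shift_id = %s",
        ("alice", 1234))
    conn.commit()
else:
    conn.rollback()   # refuse: at least one doctor must stay on call
```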
More examples of Write Skew
- Meeting room booking system
- Multiplayer game
- Claiming a username
- Preventing double-spending
Phantoms
- All of these examples follow a similar pattern:
- A SELECT query checks whether some requirement is satisfied by searching for matching rows
- Depending on the result of that first query, the application decides how to continue
- It then makes an insert / update / delete to the database and commits the transaction
- The effect, where a write in one transaction changes the result of a search query in another transaction, is called a phantom (an example of the pattern follows this list)
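The same check-then-write pattern, spelled out for the meeting-room example (sqlite3 and the simplified schema are just for illustration; note that locking the rows returned by the SELECT doesn't help here, because when the room is free there are no rows to lock):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bookings (room_id INTEGER, start_time TEXT, end_time TEXT)")
conn.commit()

# 1. Search for rows matching the condition.
(conflicts,) = conn.execute(
    "SELECT COUNT(*) FROM bookings WHERE room_id = ? "
    "AND end_time > ? AND start_time < ?",
    (123, "2024-01-01 12:00", "2024-01-01 13:00")).fetchone()

# 2. Decide based on the result, then 3. write and commit. A concurrent
#    booking committed between steps 1 and 3 silently invalidates the
#    decision (the phantom problem).
if conflicts == 0:
    conn.execute("INSERT INTO bookings VALUES (?, ?, ?)",
                 (123, "2024-01-01 12:00", "2024-01-01 13:00"))
    conn.commit()
```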
One workaround is to artificially introduce rows into the database that can be locked (e.g., a table of time slots for each room), turning the phantom into a lock conflict on concrete rows. This is called materializing conflicts.
- But it's hard and error-prone to figure out how to materialize conflicts, and it's ugly to let a concurrency control mechanism leak into the data model
- Should be a last resort; a serializable isolation level is usually preferable
Serializability
We've covered some race conditions for the read committed and snapshot isolation levels, as well as the trickier problems of write skew and phantoms.
- Isolation levels are hard to understand and inconsistently implemented in databases
- Looking at application code, it's hard to tell whether it's safe to run at a particular isolation level
- There are no good tools to help detect race conditions
- Research into static analysis exists, but those techniques haven't found their way into practical use. It's been like this since the 1970s
Serializable Isolation is regarded as the strongest isolation level
- Performance is slower
The rest of this chapter will explore
- Literally executing transactions in a serial order (Actual Serial Execution)
- Two-Phase Locking
- Optimistic concurrency control techniques
Actual Serial Execution
The simplest way to avoid concurrency problems is to remove concurrency entirely: run only one transaction at a time, in serial order, on a single thread
This only became practical fairly recently, for two reasons:
- RAM became cheap enough to keep the entire active dataset in memory
- Database designers realized that OLTP transactions are usually short and make only a small number of reads and writes
Encapsulating transactions in stored procedures
Databases can’t just sit around waiting for users to make up their minds if they want to book an airline ticket
So these databases don't allow interactive multi-statement transactions where the application and database go back and forth
Instead, the application must submit the entire transaction code to the database ahead of time as a stored procedure
Cons
- Code running in the DB is difficult to manage compared to an application server
- No version control, hard to test, hard to collect metrics
- A database is much more performance-sensitive than an application server, so a badly written stored procedure causes more trouble
Stored procedures and in-memory data allow executing all transactions on a single thread
Partitioning
Makes concurrency control simpler, but limits transaction throughput of the database to the speed of a single CPU core on a single machine.
For applications with high write throughput, the single-threaded transaction processor can become a serious bottleneck.
To scale with multiple CPU cores, can potentially partition data (supported in VoltDB)
- Doesn’t work for transactions that need to access multiple partitions
- VoltDB reports a throughput of about 1,000 cross-partition writes per second, which is orders of magnitude slower than its single-partition throughput and can't be increased by adding more machines
Two-Phase Locking (2PL)
For around 30 years, this was the only widely used algorithm for serializability in databases
- 2PL is NOT 2PC (two phase commit). They are different
Several transactions are allowed to concurrently read the same object as long as nobody is writing to it.
When someone wants to write to object, exclusive access is required
- If transaction A has read an object and B wants to write, B must wait for A to finish w/ commit or abort before it can continue
- If transactions A has written object, and B wants to read, B must wait until A commits or aborts
- Not allowed to read old versions of objects
Readers and Writers get blocked
This algorithm protects against race conditions
Implementation
MySQL(InnoDB) and SQL Server have this available in their serializable isolation level
Blocking of readers and writers is implemented by a lock on each object
- Shared mode and an exclusive mode
- Readers must acquire the lock in shared mode
- Writers must acquire the lock in exclusive mode
- A transaction may upgrade from shared to exclusive when it first reads and then writes an object
- Once a lock is acquired, it's held until the end of the transaction (commit or abort)
Called two-phase because the first phase (while the transaction executes) is when locks are acquired, and the second phase (at the end of the transaction) is when all the locks are released (a minimal shared/exclusive lock sketch follows)
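A toy shared/exclusive lock in the spirit of those rules; real databases add a lock manager, deadlock detection, and lock upgrades, all omitted here:

```python
import threading

class SharedExclusiveLock:
    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire_shared(self):
        with self._cond:
            while self._writer:              # block while a writer holds it
                self._cond.wait()
            self._readers += 1

    def release_shared(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_exclusive(self):
        with self._cond:
            while self._writer or self._readers > 0:   # block on any holder
                self._cond.wait()
            self._writer = True

    def release_exclusive(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()
```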
Performance of Two Phase Locking
Performance is very slow compared to weak isolation
- This is the main reason everybody hasn't just been using it since the 1970s
- It's very easy for a queue of transactions to build up when they have to wait on each other's locks, so latencies become unstable at high percentiles
Predicate Locks
- A predicate lock belongs to all objects that match a search condition, rather than a single object; like object locks, it can be held in shared or exclusive mode
A predicate lock applies even to objects that do not yet exist in the database but might be added in the future (phantoms)
If two-phase locking includes predicate locks, the database prevents all forms of write skew and other race conditions, making its isolation serializable
Index-range locks
Predicate locks do not perform well
Most 2PL DBs implement index-range locking
- You don't need to lock exactly the queried rows; it's safe to simplify the predicate by widening it, e.g., locking a whole range of index values that covers the query
Serializable Snapshot Isolation (SSI)
Two-phase locking doesn't perform well; serial execution doesn't scale well; weak isolation levels have good performance but are prone to race conditions
Serializable snapshot isolation seems to be promising
- Gives full serializability but has only a small performance penalty compared to snapshot isolation
First described in 2008, in Michael Cahill's PhD thesis
Today (as of the book's writing) SSI is used in both single-node databases (PostgreSQL's serializable level) and distributed databases (FoundationDB)
Pessimistic versus optimistic concurrency control
2PL is called pessimistic
Serializable snapshot isolation is an optimistic concurrency control technique
I’m not gonna bother writing these, Check out the chapter (Or read Alex Xu’s book)
The contention weakness of optimistic concurrency control can be reduced using atomic operations
- E.g., if all transactions just want to increment a counter, they can all be applied, because the order doesn't matter
Decisions based on an outdated premise
Write skew comes from acting on an outdated premise (the transaction queried the data, decided the conditions were met, and wrote, but by commit time the premise may no longer hold)
Detecting stale MVCC reads
I’m gonna skip these two pages
Something about how to detect faulty reads / bad premises when using multiple versions of objects
Detecting writes that affect prior reads
Also I will skip this one
Performance of serializable snapshot isolation
skipping..
It depends on the cases