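Preface

In lab4, to guarantee transaction atomicity, we wrote dirty pages to disk only when a transaction committed, i.e. the no-steal/force policy.

no-steal: uncommitted data never exists on disk
force: once a transaction commits, all of its updates must be persisted to disk immediately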

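In this lab we implement the steal/no-force policy instead. Under steal, uncommitted changes may already be on disk when the database crashes and restarts, so to preserve atomicity we need an undo log that lets us roll them back. Under lab4's force policy, every commit costs a disk I/O; no-force avoids that overhead by letting dirty-page writes be batched, and to preserve durability we then need a redo log so that committed data can be restored after a crash.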

steal: uncommitted data may exist on disk
no-force: after a transaction commits, its updates do not have to be persisted to disk immediately
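The relationship between the two policy choices and the two kinds of log can be summed up in a few lines. The sketch below is only an illustration: `BufferPolicy` and its methods are hypothetical names, not part of SimpleDB, and it ignores the problem of making the commit-time flush itself atomic.

```java
/** Hypothetical illustration (not SimpleDB code): which logs a buffer policy requires. */
final class BufferPolicy {
    final boolean steal; // may uncommitted pages reach disk?
    final boolean force; // must dirty pages be flushed at commit?

    BufferPolicy(boolean steal, boolean force) {
        this.steal = steal;
        this.force = force;
    }

    /** STEAL lets uncommitted data reach disk, so a crash needs undo information. */
    boolean needsUndoLog() {
        return steal;
    }

    /** NO-FORCE lets committed data live only in memory, so a crash needs redo information. */
    boolean needsRedoLog() {
        return !force;
    }

    public static void main(String[] args) {
        BufferPolicy lab4 = new BufferPolicy(false, true); // no-steal/force
        BufferPolicy lab6 = new BufferPolicy(true, false); // steal/no-force
        System.out.println("lab4 undo=" + lab4.needsUndoLog() + " redo=" + lab4.needsRedoLog());
        System.out.println("lab6 undo=" + lab6.needsUndoLog() + " redo=" + lab6.needsRedoLog());
    }
}
```

In this simplified view, lab4 needs neither log, while lab6 needs both.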

Full code

I. About lab6

In this lab you will implement log-based rollback for aborts and log-based crash recovery. We supply you with the code that defines the log format and appends records to a log file at appropriate times during transactions. You will implement rollback and recovery using the contents of the log file.
The logging code we provide generates records intended for physical whole-page undo and redo. When a page is first read in, our code remembers the original content of the page as a before-image. When a transaction updates a page, the corresponding log record contains that remembered before-image as well as the content of the page after modification as an after-image. You’ll use the before-image to roll back during aborts and to undo loser transactions during recovery, and the after-image to redo winners during recovery.
We are able to get away with doing whole-page physical UNDO (while ARIES must do logical UNDO) because we are doing page level locking and because we have no indices which may have a different structure at UNDO time than when the log was initially written. The reason page-level locking simplifies things is that if a transaction modified a page, it must have had an exclusive lock on it, which means no other transaction was concurrently modifying it, so we can UNDO changes to it by just overwriting the whole page.
Your BufferPool already implements abort by deleting dirty pages, and pretends to implement atomic commit by forcing dirty pages to disk only at commit time. Logging allows more flexible buffer management (STEAL and NO-FORCE), and our test code calls BufferPool.flushAllPages() at certain points in order to exercise that flexibility.

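On the log format that lab6 provides: there are five record types, ABORT, COMMIT, UPDATE, BEGIN, and CHECKPOINT, which record a transaction abort, a transaction commit, a dirty page before it is written to disk, a transaction start, and a checkpoint, respectively. All of these records are written to the same log file.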

The format of the log file is as follows:

The first long integer of the file represents the offset of the last written checkpoint, or -1 if there are no checkpoints

All additional data in the log consists of log records. Log records are variable length.

Each log record begins with an integer type and a long integer transaction id.

Each log record ends with a long integer file offset representing the position in the log file where the record began.

There are five record types: ABORT, COMMIT, UPDATE, BEGIN, and CHECKPOINT

ABORT, COMMIT, and BEGIN records contain no additional data

UPDATE RECORDS consist of two entries, a before image and an after image. These images are serialized Page objects, and can be accessed with the LogFile.readPageData() and LogFile.writePageData() methods. See LogFile.print() for an example.

CHECKPOINT records consist of active transactions at the time the checkpoint was taken and their first log record on disk. The format of the record is an integer count of the number of transactions, as well as a long integer transaction id and a long integer first record offset for each active transaction.

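Every log record starts with an int record type and a long transaction id, and ends with a long offset. ABORT, COMMIT, and BEGIN records carry no additional data. An UPDATE record holds a before-image and an after-image, i.e. the page contents before and after the modification. We use the before-image to roll back a transaction that aborts, and the after-image when a transaction committed but its data was lost in a failure. A CHECKPOINT record stores the number of active transactions plus, for each active transaction, its transaction id and the offset of its first log record.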
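To make the layout concrete, here is a minimal in-memory sketch of the framing for the data-less record types. It uses `DataOutputStream` as a stand-in for the lab's `RandomAccessFile` (both write big-endian ints and longs). The class name, helper names, and the numeric values of the type constants are assumptions for illustration; check `LogFile` for the real constants.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical demo of the record framing; not part of SimpleDB. */
final class LogRecordLayoutDemo {
    // Assumed values, for illustration only.
    static final int COMMIT_RECORD = 2;
    static final int BEGIN_RECORD = 4;
    static final long NO_CHECKPOINT = -1L;

    /** Appends a BEGIN/COMMIT/ABORT-style record: type, tid, then the record's own start offset. */
    static void writeSimpleRecord(DataOutputStream out, int type, long tid) throws IOException {
        long start = out.size(); // bytes written so far = offset where this record begins
        out.writeInt(type);
        out.writeLong(tid);
        out.writeLong(start);
    }

    /** Builds a tiny log: header (no checkpoint), BEGIN for tid 7, COMMIT for tid 7. */
    static byte[] buildLog() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeLong(NO_CHECKPOINT);
            writeSimpleRecord(out, BEGIN_RECORD, 7L);
            writeSimpleRecord(out, COMMIT_RECORD, 7L);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads the log back record by record and renders it as text. */
    static String describe(byte[] log) {
        StringBuilder sb = new StringBuilder();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(log))) {
            sb.append("checkpoint=").append(in.readLong());
            while (in.available() > 0) {
                int type = in.readInt();
                long tid = in.readLong();
                long start = in.readLong();
                sb.append(" [type=").append(type)
                  .append(" tid=").append(tid)
                  .append(" start=").append(start).append("]");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(buildLog()));
    }
}
```

Each data-less record is 20 bytes (4 + 8 + 8), so after the 8-byte header the two records start at offsets 8 and 28. The trailing start offset is what makes it possible to walk the log backwards from the end.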

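During crash recovery, modifications made before the checkpoint have already been flushed to disk. For log records after the checkpoint, we only guarantee that the modifications have been persisted to the log. So when the database crashes, to preserve durability, we read forward from the checkpoint and restore data according to the log records. Transactions that were still active at the checkpoint may have records before it, which is why the CHECKPOINT record stores each active transaction's first record offset.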
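One way to turn that observation into a starting position for the recovery scan is sketched below. This is an assumption about how a checkpoint-aware recovery could pick its start offset, not what the lab requires; `RecoveryStart` and `scanStart` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not SimpleDB code): where should a checkpoint-aware scan begin? */
final class RecoveryStart {
    static final long NO_CHECKPOINT = -1L;
    static final long HEADER_BYTES = 8L; // the leading long that stores the checkpoint offset

    /**
     * @param checkpointOffset     offset of the last CHECKPOINT record, or -1 if none
     * @param activeTxFirstOffsets tid -> first log record offset, as stored in that checkpoint
     */
    static long scanStart(long checkpointOffset, Map<Long, Long> activeTxFirstOffsets) {
        if (checkpointOffset == NO_CHECKPOINT) {
            return HEADER_BYTES; // no checkpoint: scan every record
        }
        long start = checkpointOffset;
        // A transaction active at the checkpoint may later turn out to be a loser,
        // so its earlier UPDATE records must be visited to find its before-images.
        for (long first : activeTxFirstOffsets.values()) {
            start = Math.min(start, first);
        }
        return start;
    }

    public static void main(String[] args) {
        Map<Long, Long> active = new HashMap<>();
        active.put(1L, 120L);
        active.put(2L, 300L);
        System.out.println(scanStart(500L, active));
    }
}
```

Scanning the whole log from the first record is always a safe, if slower, alternative.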

Course link

Lab link

II. lab6

1.Exercise 1

Rollback in lab6 is implemented with the before-image. Before each data page is written to disk, the logWrite method must be called to record the update in the log.

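So to roll back, we simply write the before-image data back to disk. First look up the position of the transaction's first record in tidToFirstLogRecord and seek to it, then read log records according to the log format. For each UPDATE record, check the transaction id to see whether it belongs to the transaction being rolled back; if it does, write its before-image back. Only the first before-image seen for each page is written, since that is the oldest state of the page within the transaction.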

    public void rollback(TransactionId tid)
        throws NoSuchElementException, IOException {
        synchronized (Database.getBufferPool()) {
            synchronized(this) {
                preAppend();
                // some code goes here

                // Look up where this transaction's first log record starts.
                Long firstLogRecord = tidToFirstLogRecord.get(tid.getId());
                if (firstLogRecord == null) {
                    throw new NoSuchElementException("no log records for transaction " + tid.getId());
                }

                // Seek to that record.
                raf.seek(firstLogRecord);
                // Pages that have already been restored for this transaction.
                Set<PageId> set = new HashSet<>();

                // Read records in log order. For each UPDATE record that belongs to
                // this transaction, write the before-image back (first one per page only).
                while (true) {
                    try {
                        // Each log record begins with an integer type and a long integer
                        // transaction id.
                        int type = raf.readInt();
                        long txid = raf.readLong();
                        switch (type) {
                            case UPDATE_RECORD:
                                // UPDATE records consist of two entries, a before image and an
                                // after image, both serialized Page objects.
                                Page beforeImage = readPageData(raf);
                                readPageData(raf); // skip the after image; it is not needed for undo
                                PageId pageId = beforeImage.getId();
                                if (txid == tid.getId() && !set.contains(pageId)) {
                                    set.add(pageId);
                                    // Drop the stale cached copy, then restore the page on disk.
                                    Database.getBufferPool().discardPage(pageId);
                                    Database.getCatalog().getDatabaseFile(pageId.getTableId()).writePage(beforeImage);
                                }
                                break;
                            case CHECKPOINT_RECORD:
                                // CHECKPOINT records hold an integer count of active transactions,
                                // followed by a long transaction id and a long first record
                                // offset for each of them. Skip over all of it.
                                int txCnt = raf.readInt();
                                while (txCnt-- > 0) {
                                    raf.readLong();
                                    raf.readLong();
                                }
                                break;
                            default:
                                // ABORT, COMMIT, and BEGIN carry no additional data.
                                break;
                        }
                        // Each log record ends with a long integer file offset representing
                        // the position in the log file where the record began.
                        raf.readLong();
                    } catch (EOFException e) {
                        // Reached the end of the log.
                        break;
                    }
                }
            }
        }
    }
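Why only the first before-image per page is written can be seen in a toy model. The sketch below is not SimpleDB code: pages are plain strings, the log is an in-memory list, and `ToyRollback` and `Update` are hypothetical names. It also assumes, purely for illustration, that each UPDATE record carries the page contents from just before that individual write.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of log-based rollback; pages are strings and the "disk" is a map. */
final class ToyRollback {
    /** Hypothetical, simplified UPDATE record. */
    static final class Update {
        final long tid;
        final int pageNo;
        final String before;
        final String after;

        Update(long tid, int pageNo, String before, String after) {
            this.tid = tid;
            this.pageNo = pageNo;
            this.before = before;
            this.after = after;
        }
    }

    /** Scans the log forward and restores the first before-image of every page tid touched. */
    static void rollback(long tid, List<Update> log, Map<Integer, String> disk) {
        Set<Integer> restored = new HashSet<>();
        for (Update u : log) {
            // Set.add returns false if the page was already restored, so later
            // (newer) before-images of the same page are ignored.
            if (u.tid == tid && restored.add(u.pageNo)) {
                disk.put(u.pageNo, u.before);
            }
        }
    }

    public static void main(String[] args) {
        Map<Integer, String> disk = new HashMap<>();
        disk.put(1, "C"); // page 1 was written twice by transaction 1: A -> B -> C
        disk.put(2, "Y"); // page 2 was written by transaction 2: X -> Y
        List<Update> log = new ArrayList<>();
        log.add(new Update(1L, 1, "A", "B"));
        log.add(new Update(1L, 1, "B", "C"));
        log.add(new Update(2L, 2, "X", "Y"));
        rollback(1L, log, disk);
        System.out.println("page 1 = " + disk.get(1) + ", page 2 = " + disk.get(2));
    }
}
```

Rolling back transaction 1 returns page 1 to its original contents and leaves the page written by transaction 2 alone.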

2.Exercise 2

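To implement crash recovery, the idea is to locate the checkpoint, read the log records that follow it, and restore data from them. For uncommitted transactions we restore with the before-image; for committed transactions we restore with the after-image. Note that the code below does not actually seek to the checkpoint; it scans the log from its first record.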

    public void recover() throws IOException {
        synchronized (Database.getBufferPool()) {
            synchronized (this) {
                recoveryUndecided = false;
                // some code goes here
                raf = new RandomAccessFile(logFile, "rw");
                // Ids of committed transactions, kept in commit order (LinkedHashSet,
                // from java.util) so that redo replays winners in log order.
                Set<Long> committedId = new LinkedHashSet<>();
                // Before-images and after-images collected per transaction id.
                Map<Long, List<Page>> beforePages = new HashMap<>();
                Map<Long, List<Page>> afterPages = new HashMap<>();
                // The file starts with the offset of the last checkpoint (-1 if none).
                // This implementation skips it and scans the whole log from the first
                // record, which is simpler but slower than starting at the checkpoint.
                raf.readLong();
                while (true) {
                    try {
                        int type = raf.readInt();
                        long txid = raf.readLong();
                        switch (type) {
                            case UPDATE_RECORD:
                                Page beforeImage = readPageData(raf);
                                Page afterImage = readPageData(raf);
                                beforePages.computeIfAbsent(txid, k -> new ArrayList<>()).add(beforeImage);
                                afterPages.computeIfAbsent(txid, k -> new ArrayList<>()).add(afterImage);
                                break;
                            case COMMIT_RECORD:
                                committedId.add(txid);
                                break;
                            case CHECKPOINT_RECORD:
                                // Skip the list of (transaction id, first record offset) pairs.
                                int numTxs = raf.readInt();
                                while (numTxs-- > 0) {
                                    raf.readLong();
                                    raf.readLong();
                                }
                                break;
                            default:
                                break;
                        }
                        // Trailing start offset of the record.
                        raf.readLong();
                    } catch (EOFException e) {
                        // Reached the end of the log.
                        break;
                    }
                }

                // Undo uncommitted transactions by writing their before-images,
                // newest first so the oldest before-image of a page is written last.
                for (long txid : beforePages.keySet()) {
                    if (!committedId.contains(txid)) {
                        List<Page> pages = beforePages.get(txid);
                        for (int i = pages.size() - 1; i >= 0; i--) {
                            Page p = pages.get(i);
                            Database.getCatalog().getDatabaseFile(p.getId().getTableId()).writePage(p);
                        }
                    }
                }

                // Redo committed transactions by writing their after-images in commit order.
                for (long txid : committedId) {
                    if (afterPages.containsKey(txid)) {
                        List<Page> pages = afterPages.get(txid);
                        for (Page page : pages) {
                            Database.getCatalog().getDatabaseFile(page.getId().getTableId()).writePage(page);
                        }
                    }
                }
            }
        }
    }
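The undo-losers-then-redo-winners idea can be checked on a toy model as well. The sketch below is a simplified illustration rather than SimpleDB code: pages are strings, the log is an in-memory list holding only UPDATE and COMMIT records, and `ToyRecovery` and `Rec` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of crash recovery; pages are strings and the "disk" is a map. */
final class ToyRecovery {
    enum Kind { UPDATE, COMMIT }

    /** Hypothetical, simplified log record. */
    static final class Rec {
        final Kind kind;
        final long tid;
        final int pageNo;
        final String before;
        final String after;

        private Rec(Kind kind, long tid, int pageNo, String before, String after) {
            this.kind = kind;
            this.tid = tid;
            this.pageNo = pageNo;
            this.before = before;
            this.after = after;
        }

        static Rec update(long tid, int pageNo, String before, String after) {
            return new Rec(Kind.UPDATE, tid, pageNo, before, after);
        }

        static Rec commit(long tid) {
            return new Rec(Kind.COMMIT, tid, -1, null, null);
        }
    }

    static void recover(List<Rec> log, Map<Integer, String> disk) {
        // Pass 1: find the winners (transactions with a COMMIT record).
        Set<Long> committed = new HashSet<>();
        for (Rec r : log) {
            if (r.kind == Kind.COMMIT) {
                committed.add(r.tid);
            }
        }
        // Pass 2: undo losers, newest record first, using before-images.
        for (int i = log.size() - 1; i >= 0; i--) {
            Rec r = log.get(i);
            if (r.kind == Kind.UPDATE && !committed.contains(r.tid)) {
                disk.put(r.pageNo, r.before);
            }
        }
        // Pass 3: redo winners, oldest record first, using after-images.
        for (Rec r : log) {
            if (r.kind == Kind.UPDATE && committed.contains(r.tid)) {
                disk.put(r.pageNo, r.after);
            }
        }
    }

    public static void main(String[] args) {
        List<Rec> log = new ArrayList<>();
        log.add(Rec.update(1L, 1, "A", "B"));
        log.add(Rec.commit(1L));
        log.add(Rec.update(2L, 1, "B", "C")); // transaction 2 never commits
        log.add(Rec.update(3L, 2, "X", "Y"));
        log.add(Rec.commit(3L));
        Map<Integer, String> disk = new HashMap<>();
        disk.put(1, "C"); // STEAL: uncommitted data reached disk before the crash
        disk.put(2, "X"); // NO-FORCE: committed data had not reached disk
        recover(log, disk);
        System.out.println("page 1 = " + disk.get(1) + ", page 2 = " + disk.get(2));
    }
}
```

Here page 1 loses the uncommitted write of transaction 2 and page 2 regains the committed write of transaction 3, which are exactly the two failure cases that steal and no-force introduce.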

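Summary

After putting it off for half a year, I have finally finished my 6.830 write-ups. I also got back to campus today and need to start working on my short paper. Keep going!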

MIT6.830 lab6: a simple database implementation
