CTools, the learning management system (LMS) for the University of Michigan is based on Sakai. Current versions of Sakai allow for uploads of resources -- binary blobs (files) that are managed internally, but in CTools case are stored on the filesystem. Sakai keeps a map of the original filename and the name as stored on disk (based on a GUID and hashed in a directory structure that looks like Year/DayOfYear/HourOfDay/fileGUID) in a SQL table called content_resource. Deletes of a resource remove the file on disk and the table row entirely. This means undeletes are not possible out-of-the-box. Based on the "event" table one could reconstruct which file was deleted and retrieve it from a filesystem snapshot manually and quite painfully (esp. when it was on AFS as that involved requesting the snapshot volume being mounted etc. which took time).
A couple of years ago, all the CTools resource storage was moved to a Netapp filer (snapmirrored to a standby at the secondary site every 10 minutes) and the files served over NFS to CTools/Sakai. The Netapp was managed by our group and we kept 10 days of snapshots online (readonly). This meant we could automate undeletes -- for now only CTools support staff can request them, but they do so through a web interface which drives a fully automated process not requiring sysadmin hands.
Here's how it works:
A couple of years ago, all the CTools resource storage was moved to a Netapp filer (snapmirrored to a standby at the secondary site every 10 minutes) and the files served over NFS to CTools/Sakai. The Netapp was managed by our group and we kept 10 days of snapshots online (readonly). This meant we could automate undeletes -- for now only CTools support staff can request them, but they do so through a web interface which drives a fully automated process not requiring sysadmin hands.
Here's how it works:
- snapshots are made nightly, so a file has to exist when the snapshot is taken -- files created and deleted before a snapshot is made is not recoverable
- the undelete has to be requested within 10 days of deletion
- a ticket number has to be provided as that is used to name a directory in the support AFS space where the undeleted files are copied to from the snapshot
- CTools uses Oracle, so a table was created:
with an accompanying trigger:
CREATE TABLE CONTENT_RESOURCE_DELETED
( RESOURCE_ID VARCHAR2(256 BYTE),
XML CLOB,
IN_COLLECTION VARCHAR2(256 BYTE),
FILE_PATH VARCHAR2(128 BYTE),
RESOURCE_UUID VARCHAR2(36 BYTE),
DELETE_DATE DATE
) ;
CREATE OR REPLACE TRIGGER TRG_CONTENT_RESOURCE_4_DELETED
AFTER DELETE ON CONTENT_RESOURCE
FOR EACH ROW
BEGIN
INSERT INTO CONTENT_RESOURCE_DELETED
(RESOURCE_ID,XML,IN_COLLECTION,FILE_PATH,RESOURCE_UUID,DELETE_DATE)
VALUES(:OLD.RESOURCE_ID,:OLD.XML,:OLD.IN_COLLECTION,:OLD.FILE_PATH,:OLD.RESOURCE_UUID,SYSDATE);
END; - Each time a file is deleted, the row is copied to the content_resource_deleted table and preserved for 30 days (after which it is dropped).
- The web-interface (a simple JSP that does a read-only lookup of the content_resource _deleted table based on a site_id) allows the support staff to pick the files to be undeleted (it only allows them to undelete files deleted before the last snapshot and deleted at most 11 days ago); picked files are saved to an undelete_queue table
---+---------------+---------------+------+---------+---------------+----+
# | column | type | null | default | pk | fk |
---+---------------+---------------+------+---------+---------------+----+
1 | FILE_PATH | VARCHAR2(128) | NO | [NULL] | SYS_C00121850 | |
2 | RESOURCE_ID | VARCHAR2(256) | NO | [NULL] | | |
3 | TICKET | VARCHAR2(64) | NO | [NULL] | | |
4 | STATUS | VARCHAR2(12) | YES | [NULL] | | |
5 | DELETE_DATE | DATE(7) | YES | [NULL] | | |
6 | INSERTEDBY | VARCHAR2(16) | NO | [NULL] | | |
7 | UPDATEDBY | VARCHAR2(16) | NO | [NULL] | | |
8 | UNDELETE_DATE | DATE(7) | YES | [NULL] | | |
---+---------------+---------------+------+---------+---------------+----+ - a script runs out of cron on a machine that has access to both AFS and the Netapp snapshots (currently one of our spare application servers) and polls the undelete_queue table for undelete requests
- upon encountering an undelete request, the script searches for the file (knowing the deletion date and file-on-disk name it is a rather easy process to step through the snapshots from most recent to oldest so that the most recent copy is used) and copies it to the support AFS space into a directory named with the ticket number, naming the file with the original names (not GUID) that the user uploaded them with
- the undelete_queue row status is updated appropriately depending on whether the file is successfully undeleted on not (ie., might not have existed when the snapshot was made) and an email is sent to the ticketing system with the result, alerting the support staff so that they can provide the file to the end-user
- apart from some issues with the original file having a UTF-8 name (not all the toolchain is UTF-8 friendly) in which case it is copied preserving the GUID name, there have been remarkably few problems
- stats so far are:
dbmon@ctools.vip> select min(undelete_date) from UNDELETE_QUEUE ;
----------------------+
MIN(UNDELETE_DATE)
----------------------+
2009-01-26 10:23:25.0
----------------------+
1 row in result (first row: 74 msec; total: 74 msec)
dbmon@ctools.vip> select count(*), status from UNDELETE_QUEUE group by status;
---------+----------+
COUNT(*) STATUS
---------+----------+
258 UNDELETED
8 NotFound
8 Exists
---------+----------+
3 rows in result (first row: 42 msec; total: 43 msec) - we probably could increase the number of snapshots (say every 3 hours) though given that only 8 were not found, that doesn't seem worth the extra storage required (though that might still be manageable)
- it would be ideal if Sakai provided user-managed soft-deletes and restores within the application (something promised for JCR backed resources in the future)
- for now, this system has worked remarkably well and saved a lot of sysadmin time with minimal development (took about a week including debugging and testing)