diff scripts/migration/migration_instructions.txt @ 10:a50cf11e5178

Rewrite LGDataverse completely upgrading to dataverse4.0
author Zoe Hong <zhong@mpiwg-berlin.mpg.de>
date Tue, 08 Sep 2015 17:00:21 +0200
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/scripts/migration/migration_instructions.txt	Tue Sep 08 17:00:21 2015 +0200
@@ -0,0 +1,89 @@
+Migration steps:
+
+Assumptions:
+
+- DVN 3.6 networkAdmin has id = 1
+- Dataverse 4.0 admin has id = 1 (created by setup-all.sh script)
+
+
+Pre-steps (contained in the migration_presteps document):
+
+-7. Make a copy of the production db, and point an app server at it
+-6. (if there is any data that will fail validation, run a scrubbing script - this will need to be customized per installation)
+-5.9 run the duplicate-user scrubbing scripts
+-5.8 run the users-as-emails scripts
+-5. Export DDI files from the 3.6 copy for all datasets to be migrated
+    (this now includes exporting non-released versions - presteps doc updated)
+-4. Create copies of the tables in the 3.6 database for the migrated data
+-3. Run pg_dump to extract the table copies (see the sketch after this list)
+-2. Import the copied tables into the 4.0 database
+-1. Run the offsets on the _dvn3_ tables in the 4.0 DB
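+
+A minimal sketch of pre-steps -3 and -2, assuming the table copies were
+created with a _dvn3_ prefix (user and database names are placeholders
+for your environment):
+
+# on the 3.6 host: dump only the copied tables
+pg_dump -U <db user> -t '_dvn3_*' <3.6 db name> > dvn3_tables.sql
+
+# on the 4.0 host: load them into the 4.0 database
+psql -U <db user> <4.0 db name> -f dvn3_tables.sql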
+
+Migration:
+
+1. run the migrate_users.sql script
+2. run the migrate_dataverses.sql script
+2a. migrate preloaded customizations
+3. run the custom_field_map.sql script (this must be updated to contain the custom field mappings
+    specific to the migration source installation; see the psql sketch below for one way to run these scripts)
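+
+    A minimal sketch of running one of these SQL scripts with psql (the
+    connection parameters are placeholders for your installation; the
+    same invocation applies to the scripts in steps 5, 7, and 8):
+
+    psql -h localhost -U <db user> <4.0 db name> -f migrate_users.sql
+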
+4. run the dataset migration API: execute the following HTTP request on the Dataverse 4.0 application to initiate dataset migration:
+    
+    http://<hostname>/api/batch/migrate?path=<parent directory of DDI files>&key=<Dataverse Admin API Key>
+
+    This will return a success message and begin an asynchronous migration job - the status of the job is viewable in the import-log file
+    in the Glassfish logs directory.
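+
+    As one way to issue the request (the path shown is a hypothetical
+    example - use the actual parent directory of your exported DDI
+    files; the URL is quoted so the shell does not interpret the &):
+
+    curl "http://<hostname>/api/batch/migrate?path=/opt/dvn/ddi-export&key=<Dataverse Admin API Key>"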
+
+5. run the migrate_datasets.sql script (post-migration scrubbing)
+6. run the files scripts:
+
+a. On the *destination* (4.0) server,
+run the step 1 script and save the output:
+
+./files_destination_step1_ > migrated_datasets.txt
+
+b. On the *source* (3.6) server,
+run the script on the input produced in a.,
+and save the SQL output:
+
+./files_source_ < migrated_datasets.txt > files_import.sql
+
+(the script will also produce an output file, packlist.txt,
+to be used in step d. for copying the physical files)
+
+c. On the destination server, import the SQL produced in b.:
+
+psql <... params ...> -f files_import.sql
+
+d. Package the files on the source server:
+
+[TODO: there will be a script for this too]
+
+as of now, this can be done manually by tarring the files listed
+in packlist.txt, for example:
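+
+# a minimal sketch, assuming packlist.txt contains one file path per
+# line; the archive name is an arbitrary choice
+tar -czf migrated_files.tar.gz -T packlist.txt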
+
+e. Unpack the files packaged above into the files directory on
+the destination server.
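+
+A minimal sketch, assuming the tarball from the step d. example;
+<files directory> stands for your installation's configured files
+directory:
+
+tar -xzf migrated_files.tar.gz -C <files directory>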
+
+7. run the migrate_permissions.sql script (you may need to delete some duplicates)
+8. run migrate_links.sql script
+
+9. reset sequences
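+
+A minimal sketch of resetting one sequence via psql (the sequence and
+table names here are hypothetical examples - reset every sequence whose
+table received migrated rows):
+
+psql <... params ...> -c \
+  "SELECT setval('dvobject_id_seq', (SELECT COALESCE(MAX(id), 1) FROM dvobject));"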
+
+10. (when ready for users to log in) add user passwords
+
+__________________________________________________
+
+Still to be migrated:
+- Guestbook / stats
+
+
+__________________________________________________
+
+Not being migrated (verify?):
+-- Study Comments
+-- File Access requests
+-- Classifications
+-- Study locks
+-- VDCNetworkStats (generated data)
+
+