view scripts/migration/migration_instructions.txt @ 10:a50cf11e5178

Rewrite LGDataverse completely upgrading to dataverse4.0
author Zoe Hong <zhong@mpiwg-berlin.mpg.de>
date Tue, 08 Sep 2015 17:00:21 +0200
parents
children
line wrap: on
line source

Migration steps:

Assumptions:

- DVN 3.6 networkAdmin has id = 1
- Dataverse 4.0 admin has id = 1 (created by setup-all.sh script)


Pre steps (contained in the migration_presteps document):

-7. Make a copy of the production db, and point an app server to it
-6. (if there is any data that will fail validation, run scrubbing script - this will need to be custom per installation)
-5.9 run duplicate user scrubbing scripts
-5.8 run users as emails scripts
-5. Export DDI files from 3.6 copy for all datasets to be migrated 
    (this now includes exporting non-released versions - presteps doc. updated)
-4. Create copies of tables in 3.6 database for migrated data
-3. Run pg dump to extract tables copies
-2. Import copied tables into 4.0 database
-1. Run offsets on _dvn3_tables in the 4.0 DB

Migration:

1. run migrate_users.sql script
2. run migrate_dataverses.sql script
2a. migrate preloaded customizations
3. run custom_field_map.sql script (this must be updated to contain the custom field mappings specific to 
    the migration source installation.)
4. run dataset APIs: execute the following HTTP  request on the Dataverse 4.0 application to initiate dataset migration:
    
    http://<hostname>/api/batch/migrate?path=<parent directory of DDI files>&key=<Dataverse Admin API Key>

    This will return a success message and begin an asynchronous migration job - the status of the job is viewable in the import-log file
    in the Glassfish logs directory.

5. run migrate_datasets.sql script (post migration scrubbing)
6. run files script: 

a. On the *destination* (4.0) server, step 1
run the script, and save the output:

./files_destination_step1_ > migrated_datasets.txt

b. On the *source* (3.6) server - 
run the script on the input produced in a., 
save the sql output: 

./files_source_ < migrated_datasets.txt > files_import.sql

(the script will also produce the output file packlist.txt, 
to be used in step d., copying of physical files)

c. On the destination server, import the sql produced in b.:

psql <... params ...> -f files_import.sql

d. Package the files on the source server:

[TODO: there will be a script for this too]

as of now - can be done manually, by tarring the files listed 
in packlist.txt

e. Unpack the files packaged above in the files directory on 
the destination server.

7. run migrate_permissions.sql script (may need to delete some duplicates)
8. run migrate_links.sql script

10. reset sequences

11. (when ready for users to log in) add user passwords

__________________________________________________

Still to be migrated:
- Guestbook / stats


__________________________________________________

Not being migrated (verify?):
-- Study Comments
-- File Access requests
-- Classifications
-- Study locks
-- VDCNetworkStats (generated data)