annotate interface/insert_new_columns_into_books/readme.txt @ 11:3d6fba07bfbd

Implemented the topic tag. When exporting to HTML, each row is tagged with its topic tag (main tag).
author Zoe Hong <zhong@mpiwg-berlin.mpg.de>
date Wed, 11 Feb 2015 12:33:59 +0100
parents b12c99b7c3f0
children
rev   line source
0
b12c99b7c3f0 commit for previous development
Zoe Hong <zhong@mpiwg-berlin.mpg.de>
parents:
diff changeset
1 insert_new_columns_into_books.php parses the book information from localmonographs.xml and inserts the result into the database
2 insert_176_rows_into_books.php inserts and updates the information for 176 books in the database
3
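The two scripts above read book records from XML and write them into a books table. The following is only a rough sketch of that flow, written in Python rather than PHP and using an in-memory SQLite database as a stand-in for the real one. The XML layout, the `edition` field, and the `books` table columns are all hypothetical, since this readme does not show the actual schema.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample; the real layout of localmonographs.xml is not shown here.
SAMPLE_XML = """<books>
  <book id="1"><edition>1820</edition></book>
  <book id="2"><edition>1875</edition></book>
</books>"""


def parse_books(xml_text):
    """Return a list of (book_id, edition) tuples parsed from the XML text."""
    root = ET.fromstring(xml_text)
    return [
        (int(book.get("id")), book.findtext("edition", default=""))
        for book in root.iter("book")
    ]


def insert_new_column(conn, rows):
    """Add an 'edition' column to 'books' if it is missing, then fill it per id."""
    existing = {info[1] for info in conn.execute("PRAGMA table_info(books)")}
    if "edition" not in existing:
        conn.execute("ALTER TABLE books ADD COLUMN edition TEXT")
    conn.executemany(
        "UPDATE books SET edition = ? WHERE id = ?",
        [(edition, book_id) for book_id, edition in rows],
    )
    conn.commit()


def demo():
    """Build a tiny hypothetical books table and apply the parsed records."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO books (id, name) VALUES (?, ?)", [(1, "A"), (2, "B")]
    )
    insert_new_column(conn, parse_books(SAMPLE_XML))
    return conn.execute("SELECT id, name, edition FROM books ORDER BY id").fetchall()


if __name__ == "__main__":
    print(demo())
```

Checking for the column before ALTER TABLE keeps the sketch safe to run more than once against the same table.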
4 localmonographs.xml additional book information
5 localmonographs.txt the txt version of additional book information
6 local_monographs_176.txt the list of 176 books and their information
7
8 get_data_from_sinica.php parses the book information from the Sinica website and writes it to files stored under data_from_sinica/
9 parse_data_from_sinica.php groups the duplicate books of source 1 and writes the results to data_from_sinica/merged_books.csv
10 analyze_data_from_sinica.php counts the number of books from each source and concatenates all the CSV files from 01-71.csv
11 data_from_sinica/ CSV files storing book information, encoded in UTF-8 and Big5 (the original encoding)
12 *column_name.csv contains the mapping between column name and source
13 all_data.csv contains all the data concatenated from 01-71.csv
14 merged_books.csv contains the grouped list of duplicated books
15 list_of_local_monographs_from_sinica.xlsx contains the grouping of duplicate books that are assumed to be the same book; the Excel version of merged_books.csv
16
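The Sinica processing described above (decode the Big5 files, concatenate them, count books per source, group duplicates) might look roughly like the sketch below. It is written in Python, not PHP, and is not the code of the scripts listed here. The `title` column, the source labels, and the choice to treat books with an identical title as duplicates are assumptions made for illustration only.

```python
import csv
import io
from collections import Counter, defaultdict


def read_rows(raw_bytes, source, encoding="big5"):
    """Decode one source file and tag each row with its source label."""
    text = raw_bytes.decode(encoding)
    return [{"source": source, **row} for row in csv.DictReader(io.StringIO(text))]


def concatenate(files):
    """files maps a source label to raw CSV bytes; returns all rows in label order."""
    rows = []
    for source in sorted(files):
        rows.extend(read_rows(files[source], source))
    return rows


def count_per_source(rows):
    """Count how many book rows each source contributed."""
    return Counter(row["source"] for row in rows)


def group_duplicates(rows, key="title"):
    """Group sources by an identical key value (assumed duplicate criterion)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key].strip()].append(row["source"])
    # Keep only the values that occur more than once.
    return {value: sources for value, sources in groups.items() if len(sources) > 1}


if __name__ == "__main__":
    # Hypothetical sample data standing in for two of the per-source CSV files.
    files = {
        "01": "title\n蘇州府志\n杭州府志\n".encode("big5"),
        "02": "title\n蘇州府志\n".encode("big5"),
    }
    rows = concatenate(files)
    print(count_per_source(rows))
    print(group_duplicates(rows))
```

Decoding to text before any comparison means the grouping works on characters rather than on raw Big5 bytes, so the same logic applies to the UTF-8 copies by passing encoding="utf-8".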