view readme.txt @ 27:4a29bccb6c59
modify the SmartRegexSave method to prevent duplicate records in the topic_regex_relation table and to prompt the user more clearly on whether to force saving the regex file
author: Zoe Hong <zhong@mpiwg-berlin.mpg.de>
date: Tue, 03 Mar 2015 11:47:41 +0100
parents: b12c99b7c3f0
children: (none listed)
interface/    local monographs editing
    check_sections.php    find wrongly segmented sections
    check_sections_details.php    re-segment a section; should mainly be accessed from check_sections.php
        if accessed directly, the book id and overlapping threshold (i.e. count) must be given in the url,
        e.g. check_sections_details.php?book_id=12345&count=5

insert_new_columns_into_books/    parse book information from other resources, and the results
    insert_new_columns_into_books.php    parse the book information from localmonographs.xml and insert the result into the database
    localmonographs.xml    additional book information
    localmonographs.txt    the txt version of the additional book information
    get_data_from_sinica.php    parse the book information from the website of sinica and write it to files stored under data_from_sinica/
    parse_data_from_sinica.php    concatenate all the csv files from 01-71.csv
    analyze_data_from_sinica.php    count the number of books from each source
    data_from_sinica/    csv files storing book information, encoded in utf8 and big5 (the original encoding)
        column_name.csv    contains the mapping between column name and source
        all_data.csv    contains all the data concatenated from 01-71.csv

search/
    search.php    search for keywords
    search_locust_temple.php    search for keywords and mark the results with tags if they match the locust-temple-related syntax
    search_function.php    included by search.php and search_locust_temple.php; provides the common functions used by these two programs
    search_results/    all the results from search*.php are exported to html format and stored in this folder

map/    data visualization
    map.php    loads a predefined csv, overlay, and map file and visualizes them using sebastian's library
        use ?mode=1 to change the layout
    about.php    the introduction to the program map.php, included by map.php
    WindowWidget.*    javascript class enabling windows within the html document
    images/    images used by map.php
    datasets/    datasets, i.e. csv files used by map.php

coordinates/    contains programs that get the coordinates of each local monograph from CHGIS
    1820_1911/    the gis data of china in 1820 and 1911, input of get_coordinates_from_chgis.php
    csv_files/    the raw data of the results from get_coordinates_from_chgis.php
    get_coordinates_from_chgis.php    get the coordinates of each local monograph listed in local_monographs_list.txt from CHGIS and write the results to csv files under csv_files/
        use the parameter list=176 to change the input to local_monographs_list_176.txt,
        i.e. get_coordinates_from_chgis.php?list=176
    local_monographs_list.txt    list of 1824 local monographs, input for get_coordinates_from_chgis.php
    local_monographs_list_176.txt    list of 176 local monographs, input for get_coordinates_from_chgis.php
    local_monographs_coordinates.html    list of 1824 local monographs and their coordinate information, output of get_coordinates_from_chgis.php
    local_monographs_176_coordinates.html    list of 176 local monographs and their coordinate information, output of get_coordinates_from_chgis.php
    map_input_files/    a copy of the files under csv_files/, used as the input for map.php
    map.php    draw the coordinates of the book list on the map
    map.js    the javascript file for map.php
    map.css    the css file for map.php
    provincial_capital_coordinates.php    insert the provincial capital coordinates into the database and list all the local monographs along with their provincial capital coordinates
    provincial_capital_coordinates.txt    input for provincial_capital_coordinates.php
    provincial_capital_coordinates.csv    output of provincial_capital_coordinates.php

geotemco/    library by sebastian
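
The concatenation step described for parse_data_from_sinica.php (joining 01-71.csv into all_data.csv) can be sketched as follows. This is only an illustration in Python, not the project's PHP code. It assumes the parts are named 01.csv through 71.csv, are already utf8 encoded, and carry no header row that needs to be dropped; the function name concatenate_csv and its parameters are hypothetical.

```python
import csv
from pathlib import Path


def concatenate_csv(input_dir, output_path, first=1, last=71, encoding="utf-8"):
    """Append the rows of 01.csv .. 71.csv (assumed names) to one output file.

    Returns the number of rows written. Missing parts are skipped.
    """
    input_dir = Path(input_dir)
    total = 0
    with open(output_path, "w", newline="", encoding=encoding) as out:
        writer = csv.writer(out)
        for n in range(first, last + 1):
            part = input_dir / f"{n:02d}.csv"  # zero-padded: 01.csv, 02.csv, ...
            if not part.exists():
                continue  # tolerate gaps in the numbering
            with open(part, newline="", encoding=encoding) as src:
                for row in csv.reader(src):
                    writer.writerow(row)
                    total += 1
    return total
```

For the big5 copies, the same function could be called with encoding="big5", assuming every part uses that encoding consistently.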