X-Git-Url: https://git.wikimedia.ca/?p=eccc_to_commons.git;a=blobdiff_plain;f=README;h=198387187540ad0681a28ba1da727ee6758e4aa6;hp=ca371dca55e57823b9fb61ca6608289c4649de8c;hb=HEAD;hpb=2f3682a6a85c816ba37855f0633478869334c529

diff --git a/README b/README
index ca371dc..1983871 100644
--- a/README
+++ b/README
@@ -12,8 +12,7 @@ distribution. In addition to coreutils, prerequisites are:
 
 - Xmlstarlet
 - Jq
 
-This repository is sponsored by Environment and Climate change Canada and
-Wikimedia Canada.
+This repository is sponsored by Wikimedia Canada.
 
@@ -22,8 +21,11 @@ dllist.sh                 outputs a curl configuration file listing all availabl
 eccc_fixer.sh             fix upstream data XML files
 eccc_fixer.xslt           fix upstream data XML file
 commons_rules.xsd         validate ECCC XML from a Wikimedian point of view
+eccc_merger.sh            merge multiple ECCC XML files
 eccc_to_commons.sh        transform ECCC XML files into JSON
 monthly_to_commons.xslt   transform ECCC monthly XML file into JSON
+almanac_to_commons.xslt   transform ECCC almanac XML file into JSON
+mediawiki_post.sh         upload a directory to a MediaWiki
 
 Usage:
 
@@ -71,8 +73,8 @@ Keep only monthly data:
   -E '^output = ".*/monthly/[A-Z0-9]{7}.xml"$' > downloads_monthly
 
 Remove all downloads before (restart interrupted download):
-    $ sed -n '/https:\/\/climate.weather.gc.ca\/climate_data\/bulk_data_e.html?format=xml&timeframe=3&stationID=2606/,$p' \
-        downloads_all > download_continue
+    $ sed -n '/https:\/\/climate.weather.gc.ca\/climate_data\/bulk_data_e.html?format=xml&timeframe=3&stationID=2606/,$p' \
+        downloads_all > download_continue
 
 1.3 Download wanted files
 
@@ -129,11 +131,25 @@ Same as previously, the output should be empty. Otherwise, you must
 resolve every single problem before continuing.
 
+[OPTIONAL STEP] Merge multiple XML files
+
+Sometimes, per-station granularity is more detail than necessary.
+If you need to merge two or more XML files, you can use the eccc_merger.sh
+script:
+
+    $ ./eccc_merger.sh "${ECCC_CACHE}/almanac/3050519.xml" \
+        "${ECCC_CACHE}/almanac/3050520.xml" "${ECCC_CACHE}/almanac/3050521.xml" \
+        "${ECCC_CACHE}/almanac/3050522.xml" "${ECCC_CACHE}/almanac/3050526.xml" \
+        > banff.xml
+
+To get station IDs based on their geographical position, you can use the
+eccc_map tool. A public instance is hosted online at
+https://stations.wikimedia.ca/ .
+
+
 4. Transform data into target format
 
 Here we are, here is the fun part: let's create weather data in Wikimedia
 Commons format.
 
-    $ ./eccc_to_commons "${ECCC_CACHE}" "${COMMONS_CACHE}" 2>log
+    $ ./eccc_to_commons.sh "${ECCC_CACHE}" "${COMMONS_CACHE}" 2>log
 
 It will replicate the future Commons content paths inside nested directories.
 So, for example future
@@ -144,4 +160,11 @@ conversion.
 
 5. Upload to destination
 
-Not done yet.
+It's now time to share our work with the world, and that is the purpose of
+the mediawiki_post.sh script.
+
+    $ ./mediawiki_post.sh "${COMMONS_CACHE}"
+
+It takes the Commons cache as its parameter: the cache's file hierarchy will
+be replicated on Commons. On first run, it asks for the credentials of the
+MediaWiki account to use for the import.
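
The download-list steps touched by the third hunk (keep only monthly data, resume an interrupted download) can be exercised end to end on a synthetic curl configuration. This is a minimal sketch, not the repository's exact commands: the station IDs and output file names below are made up, and the use of `grep -B1` to carry along each entry's url line is an assumption about how the monthly filter is completed.

```shell
# Build a tiny synthetic curl config: each download is a url/output pair.
# Station IDs and output names here are made up for illustration.
cat > downloads_all <<'EOF'
url = "https://climate.weather.gc.ca/climate_data/bulk_data_e.html?format=xml&timeframe=2&stationID=1000"
output = "cache/almanac/A000001.xml"
url = "https://climate.weather.gc.ca/climate_data/bulk_data_e.html?format=xml&timeframe=3&stationID=2000"
output = "cache/monthly/B000002.xml"
EOF

# Keep only monthly entries: match the output line and use -B1 to keep the
# url line that precedes it; drop grep's "--" group separators.
grep -B1 -E '^output = ".*/monthly/[A-Z0-9]{7}.xml"$' downloads_all \
  | grep -v '^--$' > downloads_monthly

# Resume an interrupted run: print everything from a given station onward
# (same sed address-range trick as in the README, on the synthetic data).
sed -n '/stationID=2000/,$p' downloads_all > download_continue
```

The real downloads_all is produced by dllist.sh and is far larger, but the same pipeline applies unchanged.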