- Xmlstarlet
- Jq
-This repository is sponsored by Environment and Climate change Canada and
-Wikimedia Canada.
+This repository is sponsored by Wikimedia Canada.
Provided scripts, ordered by chronological usage:
eccc_fixer.sh fix upstream data XML files
eccc_fixer.xslt fix upstream data XML file
commons_rules.xsd validate ECCC XML from a Wikimedian point of view
+eccc_merger.sh merge multiple ECCC XML files
eccc_to_commons.sh transform ECCC XML files into JSON
monthly_to_commons.xslt transform ECCC monthly XML file into JSON
almanac_to_commons.xslt transform ECCC almanac XML file into JSON
-E '^output = ".*/monthly/[A-Z0-9]{7}.xml"$' > downloads_monthly
Remove all downloads before (restart interrupted download):
- $ sed -n '/https:\/\/climate.weather.gc.ca\/climate_data\/bulk_data_e.html?format=xml&timeframe=3&stationID=2606/,$p' \
- downloads_all > download_continue
+ $ sed -n '/https:\/\/climate.weather.gc.ca\/climate_data\/bulk_data_e.html?format=xml&timeframe=3&stationID=2606/,$p' \
+ downloads_all > download_continue
1.3 Download wanted files
every single problem before continuing.
+[OPTIONAL STEP] Merge multiple XML files
+Sometimes, per-station granularity is too fine. If you need to merge
+two or more XML files, you can use the eccc_merger.sh script:
+
+ $ ./eccc_merger.sh "${ECCC_CACHE}/almanac/3050519.xml" \
+ "${ECCC_CACHE}/almanac/3050520.xml" "${ECCC_CACHE}/almanac/3050521.xml" \
+ "${ECCC_CACHE}/almanac/3050522.xml" "${ECCC_CACHE}/almanac/3050526.xml" \
+ > banff.xml
+
+To look up station IDs by geographical position, you can use
+the eccc_map tool. A public instance is hosted at
+https://stations.wikimedia.ca/ .
+
+
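When many stations are involved, the file list for the merge can be built in a
loop instead of typed out. The following is only a sketch: the station_ids
variable and the fallback cache path are hypothetical, and it assumes
eccc_merger.sh accepts any number of file arguments, as in the merge command
above.

```shell
# Hypothetical list of station IDs to merge (taken from the Banff example).
station_ids="3050519 3050520 3050521 3050522 3050526"

# Fallback value for illustration only; use your real cache directory.
: "${ECCC_CACHE:=/tmp/eccc_cache}"

# Rebuild the positional parameters as one almanac file path per station.
set --
for id in $station_ids; do
    set -- "$@" "${ECCC_CACHE}/almanac/${id}.xml"
done

# Show the files that would be merged.
printf '%s\n' "$@"

# Run the merge only if the script is present in the current directory.
if [ -x ./eccc_merger.sh ]; then
    ./eccc_merger.sh "$@" > banff.xml
fi
```

Using the positional parameters keeps each path as a separate, properly quoted
argument, so a cache directory containing spaces is still handled correctly.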
4. Transform data into target format
Here we are, here is the fun part: let's create weather data in Wikimedia
Commons format.
- $ ./eccc_to_commons "${ECCC_CACHE}" "${COMMONS_CACHE}" 2>log
+ $ ./eccc_to_commons.sh "${ECCC_CACHE}" "${COMMONS_CACHE}" 2>log
It will replicate the future Commons content paths inside nested directories.
So, for example future