The utf8data.h file in this directory is generated from the Unicode
Character Database for version 12.1.0 of the Unicode standard.

The full set of files can be found here:

  http://www.unicode.org/Public/12.1.0/ucd/

Individual source links:

  https://www.unicode.org/Public/12.1.0/ucd/CaseFolding.txt
  https://www.unicode.org/Public/12.1.0/ucd/DerivedAge.txt
  https://www.unicode.org/Public/12.1.0/ucd/extracted/DerivedCombiningClass.txt
  https://www.unicode.org/Public/12.1.0/ucd/DerivedCoreProperties.txt
  https://www.unicode.org/Public/12.1.0/ucd/NormalizationCorrections.txt
  https://www.unicode.org/Public/12.1.0/ucd/NormalizationTest.txt
  https://www.unicode.org/Public/12.1.0/ucd/UnicodeData.txt

md5sums (verify by running "md5sum -c README.utf8data"):

  900e76da1d822a160fd6b8c0b1d70094  CaseFolding.txt
  131256380bff4fea8ad4a851616f2f10  DerivedAge.txt
  e731a4089b30002144e107e3d6f8d1fa  DerivedCombiningClass.txt
  a47c9fbd7ff92a9b261ba9831e68778a  DerivedCoreProperties.txt
  fcab6dad15e440879d92f315978f93d3  NormalizationCorrections.txt
  f9ff1c55a60decf436100f791b44aa98  NormalizationTest.txt
  755f6af699f8c8d2d958da411f78f6c6  UnicodeData.txt

sha1sums (verify by running "sha1sum -c README.utf8data"):

  dc9245f6803c4ac99555c361f5052e0b13eb779b  CaseFolding.txt
  3281104f237184cdb5d869e86eb8573678ada7da  DerivedAge.txt
  2f5f995ccb96e0fa84b15151b35d5e2681535175  DerivedCombiningClass.txt
  5b8698a3fcd5018e1987f296b02e2c17e696415e  DerivedCoreProperties.txt
  cd83935fbc012345d8792d2c704f69497e753835  NormalizationCorrections.txt
  ea419aae505b337b0d99a83fa83fe58ddff7c19f  NormalizationTest.txt
  dc973c0fc93d6f09d9ab9f70d1c9f89c447f0526  UnicodeData.txt
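
Because this README mixes prose with two kinds of checksum lines,
running a checksum tool over the whole file is likely to produce
warnings about lines it cannot parse. The following is a minimal
sketch, not part of the kernel tree, that passes only the matching
lines to the tool. The helper name verify_sums is hypothetical, and the
sketch assumes GNU coreutils md5sum/sha1sum and that the *.txt files
sit in the current directory.

```shell
# Hypothetical helper: check only the digest lines found in a README.
# $1 = file containing the checksum lines, $2 = md5sum or sha1sum
verify_sums() {
    readme=$1
    tool=$2
    case "$tool" in
        md5sum)  width=32 ;;   # an MD5 digest is 32 hex characters
        sha1sum) width=40 ;;   # a SHA-1 digest is 40 hex characters
        *) echo "unsupported tool: $tool" >&2; return 2 ;;
    esac
    # Keep only lines shaped like "<digest of that width> <file name>"
    # and let the tool check them from standard input.
    grep -E "^[[:space:]]*[0-9a-f]{$width}[[:space:]]" "$readme" | "$tool" -c -
}
```

For example, "verify_sums README.utf8data sha1sum" would check the
seven files against the sha1sums listed above.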


To update to a newer version of the Unicode standard, the latest
released version of the UCD can be found here:

  http://www.unicode.org/Public/UCD/latest/
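
As a rough sketch of fetching the seven source files for a given
release, the helpers below build the download URLs in the same layout
as the individual source links listed earlier. The function names
ucd_url and fetch_ucd are hypothetical, the use of curl is an
assumption, and the sketch assumes that newer releases keep the same
directory layout and that the generator expects the files in
fs/unicode/.

```shell
# Print the download URL for one UCD source file of a given version.
ucd_url() {
    version=$1
    file=$2
    case "$file" in
        # In the 12.1.0 links, this one file lives under extracted/.
        DerivedCombiningClass.txt) dir=extracted/ ;;
        *) dir= ;;
    esac
    printf 'https://www.unicode.org/Public/%s/ucd/%s%s\n' \
        "$version" "$dir" "$file"
}

# Download all seven files into the current directory (assumes curl).
fetch_ucd() {
    version=$1
    for f in CaseFolding.txt DerivedAge.txt DerivedCombiningClass.txt \
             DerivedCoreProperties.txt NormalizationCorrections.txt \
             NormalizationTest.txt UnicodeData.txt; do
        curl -fsSLO "$(ucd_url "$version" "$f")" || return 1
    done
}
```

Run from fs/unicode/, "fetch_ucd 12.1.0" would fetch the files that the
current utf8data.h was generated from.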

Then, build under fs/unicode/ with REGENERATE_UTF8DATA=1:

	make REGENERATE_UTF8DATA=1 fs/unicode/

After sanity checking the newly generated utf8data.h file (the
version generated from the 12.1.0 UCD should be 4,109 lines long, and
have a total size of 324k) and/or comparing it with the older version
of utf8data.h_shipped, rename it to utf8data.h_shipped.
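
The sanity check described above could be sketched as follows. The
helper name check_lines is hypothetical; it compares the line count of
a file with an expected value and reports the size in kilobytes. The
expected count for a newer UCD release is not known in advance, so the
4109 figure only applies to the 12.1.0 data.

```shell
# Hypothetical helper: compare a file's line count with an expected value.
check_lines() {
    file=$1
    expected=$2
    actual=$(wc -l < "$file")
    # Some wc implementations pad the number with spaces; arithmetic
    # expansion normalises it to a plain integer.
    actual=$((actual))
    if [ "$actual" -ne "$expected" ]; then
        echo "$file: $actual lines, expected $expected" >&2
        return 1
    fi
    echo "$file: $actual lines, $(($(wc -c < "$file") / 1024))k"
}
```

A typical use would be "check_lines fs/unicode/utf8data.h 4109",
followed by "diff -u fs/unicode/utf8data.h_shipped
fs/unicode/utf8data.h" to review what changed.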

If you are a kernel developer updating to a newer version of the
Unicode Character Database, please update this README.utf8data file
with the version of the UCD that was used and the md5sums and sha1sums
of the *.txt files before checking in the new versions of the
utf8data.h and README.utf8data files.
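
One possible way to produce the checksum sections for an updated
README is sketched below. The helper name print_sums is hypothetical;
it prints an md5sums section and a sha1sums section in the layout used
in this file, and assumes GNU coreutils md5sum and sha1sum are
available.

```shell
# Hypothetical helper: print README-style checksum sections for the
# files given as arguments.
print_sums() {
    for tool in md5sum sha1sum; do
        # Section heading, e.g.: md5sums (verify by running "md5sum -c ...")
        printf '%ss (verify by running "%s -c README.utf8data"):\n\n' \
            "$tool" "$tool"
        for f in "$@"; do
            # Indent each digest line by two spaces, as in this README.
            printf '  '
            "$tool" "$f"
        done
        echo
    done
}
```

Run from the directory holding the downloaded files, for instance as
"print_sums CaseFolding.txt DerivedAge.txt" with the remaining file
names appended, and paste the output over the old sections.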