mirror of https://github.com/lukechilds/node.git
The two main goals of this change are:
- To make it easier to build the Intl option using ICU (particularly,
  using a newer ICU than v8/Chromium's version)
- To enable a much smaller ICU build with only English support

The goal here is to get node.js binaries built this way by default so
that the Intl API can be used. Additional data can be added at
execution time (see Readme and wiki).

More details are at https://github.com/joyent/node/pull/7719

In particular, this change adds the "--with-intl=" configure option to
provide more ways of building "Intl":
- "full-icu" picks up an ICU from deps/icu
- "small-icu" is similar, but builds only English
- "system-icu" uses pkg-config to find an installed ICU
- "none" does nothing (no Intl)

For Windows builds, the "full-icu" or "small-icu" options are added to
vcbuild.bat.

Note that the existing "--with-icu-path" option is not removed from
configure, but may not be used alongside the new option.

Wiki changes have already been made on
https://github.com/joyent/node/wiki/Installation
and a new page created at
https://github.com/joyent/node/wiki/Intl
(marked as provisional until this change lands.)

Summary of changes:

* README.md : doc updates
* .gitignore : added "deps/icu" as this is the location where ICU is
  unpacked to.
* Makefile : added the tools/icu/* files to cpplint, but excluded a
  problematic file.
* configure : added the "--with-intl" option mentioned above.
  Calculate at config time the list of ICU source files to use and
  data packaging options.
* node.gyp : add the new files src/node_i18n.cc/.h as well as ICU
  linkage.
* src/node.cc : add call into
  node::i18n::InitializeICUDirectory(icu_data_dir) as well as new
  --icu-data-dir option and NODE_ICU_DATA env variable to configure ICU
  data loading. This loading is only relevant in the "small"
  configuration.
* src/node_i18n.cc : new source file for the above Initialize..
  function, to setup ICU as needed.
* tools/icu : new directory with some tools needed for this build.
* tools/icu/icu-generic.gyp : new .gyp file that builds ICU in some new
  ways, both on unix/mac and windows.
* tools/icu/icu-system.gyp : new .gyp file to build node against a
  pkg-config detected ICU.
* tools/icu/icu_small.json : new config file for the "English-only"
  small build.
* tools/icu/icutrim.py : new tool for trimming down ICU data. Reads the
  above .json file.
* tools/icu/iculslocs.cc : new tool for repairing ICU data manifests
  after trim operation.
* tools/icu/no-op.cc : dummy file to force .gyp into using a C++ linker.
* vcbuild.bat : added small-icu and full-icu options, to call into
  configure.
* Fixed toolset dependencies, see
  https://github.com/joyent/node/pull/7719#issuecomment-54641687

Note that because of a bug in gyp {CC,CXX}_host must also be set.
Otherwise gcc/g++ will be used by default for part of the build.

Reviewed-by: Trevor Norris <trev.norris@gmail.com>
Reviewed-by: Fedor Indutny <fedor@indutny.com>

v0.11.15-release
Steven R. Loomis
10 years ago
committed by
Trevor Norris
16 changed files with 1605 additions and 20 deletions
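The commit message mentions both a `--icu-data-dir` option and a `NODE_ICU_DATA` environment variable for locating ICU data, but the `src/node.cc` change itself is not shown on this page. The following is only a minimal sketch of how such a lookup could work. The helper name `ResolveIcuDataDir`, the `--icu-data-dir=<path>` spelling, and the rule that the flag takes precedence over the environment are assumptions made for illustration, not the commit's code.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper (not part of the commit): choose the ICU data
// directory from the command line or, failing that, from the value of an
// environment variable such as NODE_ICU_DATA.
// Returns an empty string when neither source supplies a path.
std::string ResolveIcuDataDir(int argc,
                              const char* const* argv,
                              const char* env_value) {
  static const char kFlag[] = "--icu-data-dir=";  // assumed flag spelling
  const std::size_t flag_len = sizeof(kFlag) - 1;
  for (int i = 1; i < argc; i++) {
    if (std::strncmp(argv[i], kFlag, flag_len) == 0) {
      // Assumption: an explicit flag wins over the environment.
      return std::string(argv[i] + flag_len);
    }
  }
  return env_value != nullptr ? std::string(env_value) : std::string();
}
```

A caller would pass `std::getenv("NODE_ICU_DATA")` as `env_value` and, when the result is non-empty, hand its `c_str()` to `node::i18n::InitializeICUDirectory`; an empty result corresponds to passing `NULL` so that the built-in small data is used.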
@@ -0,0 +1,88 @@
// Copyright Joyent, Inc. and other Node contributors.
//
// Permission is hereby granted, free of charge, to any person obtaining a
// copy of this software and associated documentation files (the
// "Software"), to deal in the Software without restriction, including
// without limitation the rights to use, copy, modify, merge, publish,
// distribute, sublicense, and/or sell copies of the Software, and to permit
// persons to whom the Software is furnished to do so, subject to the
// following conditions:
//
// The above copyright notice and this permission notice shall be included
// in all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
// OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
// MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN
// NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
// DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
// OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
// USE OR OTHER DEALINGS IN THE SOFTWARE.


/*
 * notes: by srl295
 *  - When in NODE_HAVE_SMALL_ICU mode, ICU is linked against "stub" (null) data
 *    ( stubdata/libicudata.a ) containing nothing, no data, and it's also
 *    linked against a "small" data file which the SMALL_ICUDATA_ENTRY_POINT
 *    macro names. That's the "english+root" data.
 *
 *    If icu_data_path is non-null, the user has provided a path and we assume
 *    it goes somewhere useful. We set that path in ICU, and exit.
 *    If icu_data_path is null, they haven't set a path and we want the
 *    "english+root" data. We call
 *       udata_setCommonData(SMALL_ICUDATA_ENTRY_POINT,...)
 *    to load up the english+root data.
 *
 *  - when NOT in NODE_HAVE_SMALL_ICU mode, ICU is linked directly with its full
 *    data. All of the variables and command line options for changing data at
 *    runtime are disabled, as they wouldn't fully override the internal data.
 *    See: http://bugs.icu-project.org/trac/ticket/10924
 */


#include "node_i18n.h"

#if defined(NODE_HAVE_I18N_SUPPORT)

#include <unicode/putil.h>
#include <unicode/udata.h>

#ifdef NODE_HAVE_SMALL_ICU
/* if this is defined, we have a 'secondary' entry point.
   compare following to utypes.h defs for U_ICUDATA_ENTRY_POINT */
#define SMALL_ICUDATA_ENTRY_POINT \
  SMALL_DEF2(U_ICU_VERSION_MAJOR_NUM, U_LIB_SUFFIX_C_NAME)
#define SMALL_DEF2(major, suff) SMALL_DEF(major, suff)
#ifndef U_LIB_SUFFIX_C_NAME
#define SMALL_DEF(major, suff) icusmdt##major##_dat
#else
#define SMALL_DEF(major, suff) icusmdt##suff##major##_dat
#endif

extern "C" const char U_DATA_API SMALL_ICUDATA_ENTRY_POINT[];
#endif

namespace node {
namespace i18n {

bool InitializeICUDirectory(const char* icu_data_path) {
  if (icu_data_path != NULL) {
    u_setDataDirectory(icu_data_path);
    return true;  // no error
  } else {
    UErrorCode status = U_ZERO_ERROR;
#ifdef NODE_HAVE_SMALL_ICU
    // install the 'small' data.
    udata_setCommonData(&SMALL_ICUDATA_ENTRY_POINT, &status);
#else  // !NODE_HAVE_SMALL_ICU
    // no small data, so nothing to do.
#endif  // !NODE_HAVE_SMALL_ICU
    return (status == U_ZERO_ERROR);
  }
}

}  // namespace i18n
}  // namespace node

#endif  // NODE_HAVE_I18N_SUPPORT
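The `SMALL_ICUDATA_ENTRY_POINT` definition above goes through `SMALL_DEF2` before reaching `SMALL_DEF`, which is the usual two-level idiom for getting a macro argument expanded before it is pasted with `##`. The sketch below reproduces that idiom with stand-in macros so it compiles without ICU headers. The `DEMO_` names are invented for illustration, and the value 53 is only an example borrowed from the `icudt53l.dat` file name mentioned elsewhere in this commit.

```cpp
#include <string>

// Stand-in for U_ICU_VERSION_MAJOR_NUM; a real build gets this from ICU.
#define DEMO_VERSION_MAJOR 53

// Same shape as SMALL_ICUDATA_ENTRY_POINT / SMALL_DEF2 / SMALL_DEF.
#define DEMO_ENTRY_POINT DEMO_DEF2(DEMO_VERSION_MAJOR)
#define DEMO_DEF2(major) DEMO_DEF(major)  // extra level: `major` expands here
#define DEMO_DEF(major) icusmdt##major##_dat

// Helpers that turn the resulting token into a string for inspection.
#define DEMO_STR2(x) #x
#define DEMO_STR(x) DEMO_STR2(x)

// Goes through DEMO_DEF2, so the version number is substituted first.
std::string TwoLevelName() { return DEMO_STR(DEMO_ENTRY_POINT); }

// Calls DEMO_DEF directly: `##` pastes the unexpanded macro name instead.
std::string OneLevelName() { return DEMO_STR(DEMO_DEF(DEMO_VERSION_MAJOR)); }
```

With these definitions `TwoLevelName()` yields `icusmdt53_dat`, while `OneLevelName()` yields `icusmdtDEMO_VERSION_MAJOR_dat`, which is why the indirection is needed for the `extern "C"` declaration to name the symbol that the data build generates.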
@@ -0,0 +1,39 @@
// Copyright Joyent, Inc. and other Node contributors.
//
// Permission is hereby granted, free of charge, to any person obtaining a
// copy of this software and associated documentation files (the
// "Software"), to deal in the Software without restriction, including
// without limitation the rights to use, copy, modify, merge, publish,
// distribute, sublicense, and/or sell copies of the Software, and to permit
// persons to whom the Software is furnished to do so, subject to the
// following conditions:
//
// The above copyright notice and this permission notice shall be included
// in all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
// OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
// MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN
// NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
// DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
// OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
// USE OR OTHER DEALINGS IN THE SOFTWARE.

#ifndef SRC_NODE_I18N_H_
#define SRC_NODE_I18N_H_

#include "node.h"

#if defined(NODE_HAVE_I18N_SUPPORT)

namespace node {
namespace i18n {

NODE_EXTERN bool InitializeICUDirectory(const char* icu_data_path);

}  // namespace i18n
}  // namespace node

#endif  // NODE_HAVE_I18N_SUPPORT

#endif  // SRC_NODE_I18N_H_
@@ -0,0 +1,26 @@
Notes about the icu directory.
===

The files in this directory were written for the node.js effort. It's
the intent of their author (Steven R. Loomis / srl295) to merge them
upstream into ICU, pending much discussion within the ICU-PMC.

`icu_small.json` is somewhat node-specific as it specifies a "small ICU"
configuration file for the `icutrim.py` script. `icutrim.py` and
`iculslocs.cc` may themselves be superseded by components built into
ICU in the future.

The following tickets were opened during this work, and their
resolution may inform the reader as to the current state of icu-trim
upstream:

 * [#10919](http://bugs.icu-project.org/trac/ticket/10919)
   (experimental branch - may copy any source patches here)
 * [#10922](http://bugs.icu-project.org/trac/ticket/10922)
   (data packaging improvements)
 * [#10923](http://bugs.icu-project.org/trac/ticket/10923)
   (rewrite data building in python)

When/if components (not including the `.json` file) are merged into
ICU, this code and `configure` will be updated to detect and use those
variants rather than the ones in this directory.
@@ -0,0 +1,414 @@
# Copyright (c) IBM Corporation and Others. All Rights Reserved. |
|||
# very loosely based on icu.gyp from Chromium: |
|||
# Copyright (c) 2012 The Chromium Authors. All rights reserved. |
|||
# Use of this source code is governed by a BSD-style license that can be |
|||
# found in the LICENSE file. |
|||
|
|||
|
|||
{ |
|||
'variables': { |
|||
'icu_src_derb': [ '../../deps/icu/source/tools/genrb/derb.c' ], |
|||
}, |
|||
'targets': [ |
|||
{ |
|||
# a target to hold uconfig defines. |
|||
# for now these are hard coded, but could be defined. |
|||
'target_name': 'icu_uconfig', |
|||
'type': 'none', |
|||
'toolsets': [ 'host', 'target' ], |
|||
'direct_dependent_settings': { |
|||
'defines': [ |
|||
'UCONFIG_NO_LEGACY_CONVERSION=1', |
|||
'UCONFIG_NO_IDNA=1', |
|||
'UCONFIG_NO_TRANSLITERATION=1', |
|||
'UCONFIG_NO_SERVICE=1', |
|||
'UCONFIG_NO_REGULAR_EXPRESSIONS=1', |
|||
'U_ENABLE_DYLOAD=0', |
|||
'U_STATIC_IMPLEMENTATION=1', |
|||
# TODO(srl295): reenable following pending |
|||
# https://code.google.com/p/v8/issues/detail?id=3345 |
|||
# (saves some space) |
|||
'UCONFIG_NO_BREAK_ITERATION=0', |
|||
], |
|||
} |
|||
}, |
|||
{ |
|||
# a target to hold common settings. |
|||
# make any target that is ICU implementation depend on this. |
|||
'target_name': 'icu_implementation', |
|||
'toolsets': [ 'host', 'target' ], |
|||
'type': 'none', |
|||
'direct_dependent_settings': { |
|||
'conditions': [ |
|||
[ 'os_posix == 1 and OS != "mac" and OS != "ios"', { |
|||
'cflags': [ '-Wno-deprecated-declarations' ], |
|||
'cflags_cc': [ '-frtti' ], |
|||
}], |
|||
[ 'OS == "mac" or OS == "ios"', { |
|||
'xcode_settings': {'GCC_ENABLE_CPP_RTTI': 'YES' }, |
|||
}], |
|||
[ 'OS == "win"', { |
|||
'msvs_settings': { |
|||
'VCCLCompilerTool': {'RuntimeTypeInfo': 'true'}, |
|||
} |
|||
}], |
|||
], |
|||
'msvs_settings': { |
|||
'VCCLCompilerTool': { |
|||
'RuntimeTypeInfo': 'true', |
|||
'ExceptionHandling': '1', |
|||
}, |
|||
}, |
|||
'configurations': { |
|||
# TODO: why does this need to be redefined for Release and Debug? |
|||
# Maybe this should be pushed into common.gypi with an "if v8 i18n"? |
|||
'Release': { |
|||
'msvs_settings': { |
|||
'VCCLCompilerTool': { |
|||
'RuntimeTypeInfo': 'true', |
|||
'ExceptionHandling': '1', |
|||
}, |
|||
}, |
|||
}, |
|||
'Debug': { |
|||
'msvs_settings': { |
|||
'VCCLCompilerTool': { |
|||
'RuntimeTypeInfo': 'true', |
|||
'ExceptionHandling': '1', |
|||
}, |
|||
}, |
|||
}, |
|||
}, |
|||
'defines': [ |
|||
'U_ATTRIBUTE_DEPRECATED=', |
|||
'_CRT_SECURE_NO_DEPRECATE=', |
|||
'U_STATIC_IMPLEMENTATION=1', |
|||
], |
|||
}, |
|||
}, |
|||
{ |
|||
'target_name': 'icui18n', |
|||
'type': '<(library)', |
|||
'toolsets': [ 'host', 'target' ], |
|||
'sources': [ |
|||
'<@(icu_src_i18n)' |
|||
], |
|||
'include_dirs': [ |
|||
'../../deps/icu/source/i18n', |
|||
], |
|||
'defines': [ |
|||
'U_I18N_IMPLEMENTATION=1', |
|||
], |
|||
'dependencies': [ 'icuucx', 'icu_implementation', 'icu_uconfig' ], |
|||
'direct_dependent_settings': { |
|||
'include_dirs': [ |
|||
'../../deps/icu/source/i18n', |
|||
], |
|||
}, |
|||
'export_dependent_settings': [ 'icuucx' ], |
|||
}, |
|||
# this library is only built for derb.. |
|||
{ |
|||
'target_name': 'icuio', |
|||
'type': '<(library)', |
|||
'toolsets': [ 'host' ], |
|||
'sources': [ |
|||
'<@(icu_src_io)' |
|||
], |
|||
'include_dirs': [ |
|||
'../../deps/icu/source/io', |
|||
], |
|||
'defines': [ |
|||
'U_IO_IMPLEMENTATION=1', |
|||
], |
|||
'dependencies': [ 'icuucx', 'icui18n', 'icu_implementation', 'icu_uconfig' ], |
|||
'direct_dependent_settings': { |
|||
'include_dirs': [ |
|||
'../../deps/icu/source/io', |
|||
], |
|||
}, |
|||
'export_dependent_settings': [ 'icuucx', 'icui18n' ], |
|||
}, |
|||
# This exports actual ICU data |
|||
{ |
|||
'target_name': 'icudata', |
|||
'type': '<(library)', |
|||
'toolsets': [ 'target' ], |
|||
'conditions': [ |
|||
[ 'OS == "win"', { |
|||
'conditions': [ |
|||
[ 'icu_small == "false"', { # and OS=win |
|||
# full data - just build the full data file, then we are done. |
|||
'sources': [ '<(SHARED_INTERMEDIATE_DIR)/icudt<(icu_ver_major)<(icu_endianness)_dat.obj' ], |
|||
'dependencies': [ 'genccode#host' ], |
|||
'actions': [ |
|||
{ |
|||
'action_name': 'icudata', |
|||
'inputs': [ '<(icu_data_in)' ], |
|||
'outputs': [ '<(SHARED_INTERMEDIATE_DIR)/icudt<(icu_ver_major)<(icu_endianness)_dat.obj' ], |
|||
'action': [ '<(PRODUCT_DIR)/genccode', |
|||
'-o', |
|||
'-d', '<(SHARED_INTERMEDIATE_DIR)', |
|||
'-n', 'icudata', |
|||
'-e', 'icudt<(icu_ver_major)', |
|||
'<@(_inputs)' ], |
|||
}, |
|||
], |
|||
}, { # icu_small == TRUE and OS == win |
|||
# link against stub data primarily |
|||
# then, use icupkg and genccode to rebuild data |
|||
'dependencies': [ 'icustubdata', 'genccode#host', 'icupkg#host', 'genrb#host', 'iculslocs#host' ], |
|||
'export_dependent_settings': [ 'icustubdata' ], |
|||
'actions': [ |
|||
{ |
|||
# trim down ICU |
|||
'action_name': 'icutrim', |
|||
'inputs': [ '<(icu_data_in)', 'icu_small.json' ], |
|||
'outputs': [ '../../out/icutmp/icudt<(icu_ver_major)<(icu_endianness).dat' ], |
|||
'action': [ 'python', |
|||
'icutrim.py', |
|||
'-P', '../../<(CONFIGURATION_NAME)', |
|||
'-D', '<(icu_data_in)', |
|||
'--delete-tmp', |
|||
'-T', '../../out/icutmp', |
|||
'-F', 'icu_small.json', |
|||
'-O', 'icudt<(icu_ver_major)<(icu_endianness).dat', |
|||
'-v' ], |
|||
}, |
|||
{ |
|||
# build final .dat -> .obj |
|||
'action_name': 'genccode', |
|||
'inputs': [ '../../out/icutmp/icudt<(icu_ver_major)<(icu_endianness).dat' ], |
|||
'outputs': [ '../../out/icudt<(icu_ver_major)<(icu_endianness)_dat.obj' ], |
|||
'action': [ '../../<(CONFIGURATION_NAME)/genccode', |
|||
'-o', |
|||
'-d', '../../out/', |
|||
'-n', 'icudata', |
|||
'-e', 'icusmdt<(icu_ver_major)', |
|||
'<@(_inputs)' ], |
|||
}, |
|||
], |
|||
# This file contains the small ICU data. |
|||
'sources': [ '../../out/icudt<(icu_ver_major)<(icu_endianness)_dat.obj' ], |
|||
} ] ], #end of OS==win and icu_small == true |
|||
}, { # OS != win |
|||
'conditions': [ |
|||
[ 'icu_small == "false"', { |
|||
# full data - just build the full data file, then we are done. |
|||
'sources': [ '<(SHARED_INTERMEDIATE_DIR)/icudt<(icu_ver_major)_dat.c' ], |
|||
'dependencies': [ 'genccode#host', 'icupkg#host', 'icu_implementation#host', 'icu_uconfig' ], |
|||
'include_dirs': [ |
|||
'../../deps/icu/source/common', |
|||
], |
|||
'actions': [ |
|||
{ |
|||
# Swap endianness (if needed), or at least copy the file |
|||
'action_name': 'icupkg', |
|||
'inputs': [ '<(icu_data_in)' ], |
|||
'outputs':[ '<(SHARED_INTERMEDIATE_DIR)/icudt<(icu_ver_major)<(icu_endianness).dat' ], |
|||
'action': [ '<(PRODUCT_DIR)/icupkg', |
|||
'-t<(icu_endianness)', |
|||
'<@(_inputs)', |
|||
'<@(_outputs)', |
|||
], |
|||
}, |
|||
{ |
|||
# Rename without the endianness marker |
|||
'action_name': 'copy', |
|||
'inputs': [ '<(SHARED_INTERMEDIATE_DIR)/icudt<(icu_ver_major)<(icu_endianness).dat' ], |
|||
'outputs':[ '<(SHARED_INTERMEDIATE_DIR)/icudt<(icu_ver_major).dat' ], |
|||
'action': [ 'cp', |
|||
'<@(_inputs)', |
|||
'<@(_outputs)', |
|||
], |
|||
}, |
|||
{ |
|||
'action_name': 'icudata', |
|||
'inputs': [ '<(SHARED_INTERMEDIATE_DIR)/icudt<(icu_ver_major).dat' ], |
|||
'outputs':[ '<(SHARED_INTERMEDIATE_DIR)/icudt<(icu_ver_major)_dat.c' ], |
|||
'action': [ '<(PRODUCT_DIR)/genccode', |
|||
'-e', 'icudt<(icu_ver_major)', |
|||
'-d', '<(SHARED_INTERMEDIATE_DIR)', |
|||
'-f', 'icudt<(icu_ver_major)_dat', |
|||
'<@(_inputs)' ], |
|||
}, |
|||
], # end actions |
|||
}, { # icu_small == true ( and OS != win ) |
|||
# link against stub data (as primary data) |
|||
# then, use icupkg and genccode to rebuild small data |
|||
'dependencies': [ 'icustubdata', 'genccode#host', 'icupkg#host', 'genrb#host', 'iculslocs#host', |
|||
'icu_implementation', 'icu_uconfig' ], |
|||
'export_dependent_settings': [ 'icustubdata' ], |
|||
'actions': [ |
|||
{ |
|||
# trim down ICU |
|||
'action_name': 'icutrim', |
|||
'inputs': [ '<(icu_data_in)', 'icu_small.json' ], |
|||
'outputs': [ '<(SHARED_INTERMEDIATE_DIR)/icutmp/icudt<(icu_ver_major)<(icu_endianness).dat' ], |
|||
'action': [ 'python', |
|||
'icutrim.py', |
|||
'-P', '<(PRODUCT_DIR)', |
|||
'-D', '<(icu_data_in)', |
|||
'--delete-tmp', |
|||
'-T', '<(SHARED_INTERMEDIATE_DIR)/icutmp', |
|||
'-F', 'icu_small.json', |
|||
'-O', 'icudt<(icu_ver_major)<(icu_endianness).dat', |
|||
'-v' ], |
|||
}, { |
|||
# rename to get the final entrypoint name right |
|||
'action_name': 'rename', |
|||
'inputs': [ '<(SHARED_INTERMEDIATE_DIR)/icutmp/icudt<(icu_ver_major)<(icu_endianness).dat' ], |
|||
'outputs': [ '<(SHARED_INTERMEDIATE_DIR)/icutmp/icusmdt<(icu_ver_major).dat' ], |
|||
'action': [ 'cp', |
|||
'<@(_inputs)', |
|||
'<@(_outputs)', |
|||
], |
|||
}, { |
|||
# build final .dat -> .obj |
|||
'action_name': 'genccode', |
|||
'inputs': [ '<(SHARED_INTERMEDIATE_DIR)/icutmp/icusmdt<(icu_ver_major).dat' ], |
|||
'outputs': [ '<(SHARED_INTERMEDIATE_DIR)/icusmdt<(icu_ver_major)_dat.c' ], |
|||
'action': [ '<(PRODUCT_DIR)/genccode', |
|||
'-d', '<(SHARED_INTERMEDIATE_DIR)', |
|||
'<@(_inputs)' ], |
|||
}, |
|||
], |
|||
# This file contains the small ICU data |
|||
'sources': [ '<(SHARED_INTERMEDIATE_DIR)/icusmdt<(icu_ver_major)_dat.c' ], |
|||
# for umachine.h |
|||
'include_dirs': [ |
|||
'../../deps/icu/source/common', |
|||
], |
|||
}]], # end icu_small == true |
|||
}]], # end OS != win |
|||
}, # end icudata |
|||
# icustubdata is a tiny (~1k) symbol with no ICU data in it. |
|||
# tools must link against it as they are generating the full data. |
|||
{ |
|||
'target_name': 'icustubdata', |
|||
'type': '<(library)', |
|||
'toolsets': [ 'host', 'target' ], |
|||
'dependencies': [ 'icu_implementation' ], |
|||
'sources': [ |
|||
'<@(icu_src_stubdata)' |
|||
], |
|||
'include_dirs': [ |
|||
'../../deps/icu/source/common', |
|||
], |
|||
}, |
|||
# this target is for v8 consumption. |
|||
# it is icuuc + stubdata |
|||
# it is only built for target |
|||
{ |
|||
'target_name': 'icuuc', |
|||
'type': 'none', |
|||
'toolsets': [ 'target' ], |
|||
'dependencies': [ 'icuucx', 'icudata' ], |
|||
'export_dependent_settings': [ 'icuucx', 'icudata' ], |
|||
}, |
|||
# This is the 'real' icuuc. |
|||
# tools can depend on 'icuuc + stubdata' |
|||
{ |
|||
'target_name': 'icuucx', |
|||
'type': '<(library)', |
|||
'dependencies': [ 'icu_implementation', 'icu_uconfig' ], |
|||
'toolsets': [ 'host', 'target' ], |
|||
'sources': [ |
|||
'<@(icu_src_common)' |
|||
], |
|||
'include_dirs': [ |
|||
'../../deps/icu/source/common', |
|||
], |
|||
'defines': [ |
|||
'U_COMMON_IMPLEMENTATION=1', |
|||
], |
|||
'export_dependent_settings': [ 'icu_uconfig' ], |
|||
'direct_dependent_settings': { |
|||
'include_dirs': [ |
|||
'../../deps/icu/source/common', |
|||
], |
|||
'conditions': [ |
|||
[ 'OS=="win"', { |
|||
'link_settings': { |
|||
'libraries': [ '-lAdvAPI32.Lib', '-lUser32.lib' ], |
|||
}, |
|||
}], |
|||
], |
|||
}, |
|||
}, |
|||
# tools library |
|||
{ |
|||
'target_name': 'icutools', |
|||
'type': '<(library)', |
|||
'toolsets': [ 'host' ], |
|||
'dependencies': [ 'icuucx', 'icui18n', 'icustubdata' ], |
|||
'sources': [ |
|||
'<@(icu_src_tools)' |
|||
], |
|||
'include_dirs': [ |
|||
'../../deps/icu/source/tools/toolutil', |
|||
], |
|||
'defines': [ |
|||
'U_TOOLUTIL_IMPLEMENTATION=1', |
|||
#'DEBUG=0', # http://bugs.icu-project.org/trac/ticket/10977 |
|||
], |
|||
'direct_dependent_settings': { |
|||
'include_dirs': [ |
|||
'../../deps/icu/source/tools/toolutil', |
|||
], |
|||
}, |
|||
'export_dependent_settings': [ 'icuucx', 'icui18n', 'icustubdata' ], |
|||
}, |
|||
# This tool is needed to rebuild .res files from .txt, |
|||
# or to build index (res_index.txt) files for small-icu |
|||
{ |
|||
'target_name': 'genrb', |
|||
'type': 'executable', |
|||
'toolsets': [ 'host' ], |
|||
'dependencies': [ 'icutools', 'icuucx', 'icui18n' ], |
|||
'sources': [ |
|||
'<@(icu_src_genrb)' |
|||
], |
|||
# derb is a separate executable |
|||
# (which is not currently built) |
|||
'sources!': [ |
|||
'<@(icu_src_derb)', |
|||
'no-op.cc', |
|||
], |
|||
}, |
|||
# This tool is used to rebuild res_index.res manifests |
|||
{ |
|||
'target_name': 'iculslocs', |
|||
'toolsets': [ 'host' ], |
|||
'type': 'executable', |
|||
'dependencies': [ 'icutools', 'icuucx', 'icui18n', 'icuio' ], |
|||
'sources': [ |
|||
'iculslocs.cc', |
|||
'no-op.cc', |
|||
], |
|||
}, |
|||
# This tool is used to package, unpackage, repackage .dat files
# and convert endiannesses
|||
{ |
|||
'target_name': 'icupkg', |
|||
'toolsets': [ 'host' ], |
|||
'type': 'executable', |
|||
'dependencies': [ 'icutools', 'icuucx', 'icui18n' ], |
|||
'sources': [ |
|||
'<@(icu_src_icupkg)', |
|||
'no-op.cc', |
|||
], |
|||
}, |
|||
# this is used to convert .dat directly into .obj |
|||
{ |
|||
'target_name': 'genccode', |
|||
'toolsets': [ 'host' ], |
|||
'type': 'executable', |
|||
'dependencies': [ 'icutools', 'icuucx', 'icui18n' ], |
|||
'sources': [ |
|||
'<@(icu_src_genccode)', |
|||
'no-op.cc', |
|||
], |
|||
}, |
|||
], |
|||
} |
@@ -0,0 +1,18 @@
# Copyright (c) 2014 IBM Corporation and Others. All Rights Reserved.

# This variant is used for the '--with-intl=system-icu' option.
# 'configure' has already set 'libs' and 'cflags' - so,
# there's nothing to do in these targets.

{
  'targets': [
    {
      'target_name': 'icuuc',
      'type': 'none',
    },
    {
      'target_name': 'icui18n',
      'type': 'none',
    },
  ],
}
@@ -0,0 +1,47 @@
{
  "copyright": "Copyright (c) 2014 IBM Corporation and Others. All Rights Reserved.",
  "comment": "icutrim.py config: Trim down ICU to just English, needed for node.js use.",
  "variables": {
    "none": {
      "only": []
    },
    "en_only": {
      "only": [
        "root",
        "en"
      ]
    },
    "leavealone": {
    }
  },
  "trees": {
    "ROOT": "en_only",
    "brkitr": "none",
    "coll": "en_only",
    "curr": "en_only",
    "lang": "none",
    "rbnf": "none",
    "region": "none",
    "zone": "en_only",
    "converters": "none",
    "stringprep": "none",
    "translit": "none",
    "brkfiles": "none",
    "brkdict": "none",
    "confusables": "none"
  },
  "remove": [
    "cnvalias.icu",
    "postalCodeData.res",
    "uts46.nrm",
    "genderList.res",
    "brkitr/root.res",
    "unames.icu"
  ],
  "keep": [
    "pool.res",
    "supplementalData.res",
    "zoneinfo64.res",
    "likelySubtags.res"
  ]
}
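The configuration above maps each data tree to a named variable, and each variable may carry an `only` list of locales. `icutrim.py` itself is not shown on this page, so the sketch below is only one plausible reading of how such a filter could be applied. The type `TrimConfig`, the function `KeepLocale`, the exact-match rule, and the treatment of a variable without an `only` list (such as `leavealone`) as "keep everything" are all assumptions made for illustration.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical in-memory form of the "trees" and "variables" sections.
struct TrimConfig {
  std::map<std::string, std::string> tree_to_variable;
  std::map<std::string, std::set<std::string> > variable_only;
};

// Builds a small subset of the configuration shown above.
TrimConfig MakeSmallIcuSubset() {
  TrimConfig cfg;
  cfg.variable_only["none"] = std::set<std::string>();  // "only": []
  cfg.variable_only["en_only"].insert("root");
  cfg.variable_only["en_only"].insert("en");
  cfg.tree_to_variable["ROOT"] = "en_only";
  cfg.tree_to_variable["coll"] = "en_only";
  cfg.tree_to_variable["lang"] = "none";
  return cfg;
}

// Returns true if `locale` in `tree` would be kept under the assumed rules:
// an exact match against the variable's "only" list, with unconfigured
// trees and variables lacking an "only" list left untouched.
bool KeepLocale(const TrimConfig& cfg,
                const std::string& tree,
                const std::string& locale) {
  auto tree_it = cfg.tree_to_variable.find(tree);
  if (tree_it == cfg.tree_to_variable.end()) return true;
  auto var_it = cfg.variable_only.find(tree_it->second);
  if (var_it == cfg.variable_only.end()) return true;
  return var_it->second.count(locale) > 0;
}
```

Under this reading, the `coll` tree keeps `en` and `root` and drops `ja`, while the `lang` tree, mapped to `none`, keeps nothing. The real script may apply further rules (for example the `remove` and `keep` lists) that this sketch does not model.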
@@ -0,0 +1,388 @@
/*
**********************************************************************
*   Copyright (C) 2014, International Business Machines
*   Corporation and others. All Rights Reserved.
**********************************************************************
*
* Created 2014-06-20 by Steven R. Loomis
*
* See: http://bugs.icu-project.org/trac/ticket/10922
*
*/

/*
WHAT IS THIS?

Here's the problem: It's difficult to reconfigure ICU from the command
line without using the full makefiles. You can do a lot, but not
everything.

Consider:

 $ icupkg -r 'ja*' icudt53l.dat

Great, you've now removed the (main) Japanese data. But something's
still wrong-- res_index (and thus, getAvailable* functions) still
claim the locale is present.

You are reading the source to a tool (using only public API C code)
that can solve this problem. Use as follows:

 $ iculslocs -i . -N icudt53l -b res_index.txt

.. Generates a NEW res_index.txt by looking at the .dat file and
figuring out which locales are actually available. It has commented out
the ones which are no longer available:

          ...
          it_SM {""}
          // ja {""}
          // ja_JP {""}
          jgo {""}
          ...

Then you can build and in-place patch it with existing ICU tools:
 $ genrb res_index.txt
 $ icupkg -a res_index.res icudt53l.dat

.. Now you have a patched icudt53l.dat that not only doesn't have
 Japanese, it doesn't *claim* to have Japanese.

*/
|||
|
|||
#include "string.h" |
|||
#include "charstr.h" // ICU internal header |
|||
#include <unicode/ustdio.h> |
|||
#include <unicode/ures.h> |
|||
#include <unicode/udata.h> |
|||
|
|||
const char* PROG = "iculslocs"; |
|||
const char* NAME = U_ICUDATA_NAME; // assume ICU data
|
|||
const char* TREE = "ROOT"; |
|||
int VERBOSE = 0; |
|||
|
|||
#define RES_INDEX "res_index" |
|||
#define INSTALLEDLOCALES "InstalledLocales" |
|||
|
|||
CharString packageName; |
|||
const char* locale = RES_INDEX; // locale referring to our index
|
|||
|
|||
void usage() {
  u_printf("Usage: %s [options]\n", PROG);
  u_printf(
      "This program lists and optionally regenerates the locale "
      "manifests\n"
      " in ICU 'res_index.res' files.\n");
  u_printf(
      " -i ICUDATA Set ICUDATA dir to ICUDATA.\n"
      " NOTE: this must be the first option given.\n");
  u_printf(" -h This Help\n");
  u_printf(" -v Verbose Mode on\n");
  u_printf(" -l List locales to stdout\n");
  u_printf(
      " if Verbose mode, then missing (unopenable) "
      "locales\n"
      " will be listed preceded by a '#'.\n");
  u_printf(
      " -b res_index.txt Write 'corrected' bundle "
      "to res_index.txt\n"
      " missing bundles will be "
      "OMITTED\n");
  u_printf(
      " -T TREE Choose tree TREE\n"
      " (TREE should be one of: \n"
      " ROOT, brkitr, coll, curr, lang, rbnf, region, zone)\n");
  // see ureslocs.h and elsewhere
  u_printf(
      " -N NAME Choose name NAME\n"
      " (default: '%s')\n",
      U_ICUDATA_NAME);
  u_printf(
      "\nNOTE: for best results, this tool ought to be "
      "linked against\n"
      "stubdata. i.e. '%s -l' SHOULD return an error with "
      "no data.\n",
      PROG);
}
|||
|
|||
#define ASSERT_SUCCESS(what) \ |
|||
if (U_FAILURE(status)) { \ |
|||
u_printf("%s:%d: %s: ERROR: %s %s\n", \ |
|||
__FILE__, \ |
|||
__LINE__, \ |
|||
PROG, \ |
|||
u_errorName(status), \ |
|||
what); \ |
|||
return 1; \ |
|||
} |
|||
|
|||
/**
|
|||
* @param status changed from reference to pointer to match node.js style |
|||
*/ |
|||
void calculatePackageName(UErrorCode* status) { |
|||
packageName.clear(); |
|||
if (strcmp(NAME, "NONE")) { |
|||
packageName.append(NAME, *status); |
|||
if (strcmp(TREE, "ROOT")) { |
|||
packageName.append(U_TREE_SEPARATOR_STRING, *status); |
|||
packageName.append(TREE, *status); |
|||
} |
|||
} |
|||
if (VERBOSE) { |
|||
u_printf("packageName: %s\n", packageName.data()); |
|||
} |
|||
} |
|||
|
|||
/**
 * Does the locale exist?
 * Assumes calculatePackageName was called.
 * @param loc name of the locale (bundle) to try to open
 * @param exists set to TRUE if exists, FALSE otherwise.
 * Changed from reference to pointer to match node.js style
 * @return 0 on "OK" (success or resource-missing),
 * 1 on "FAILURE" (unexpected error)
 */
int localeExists(const char* loc, UBool* exists) {
  UErrorCode status = U_ZERO_ERROR;
  if (VERBOSE > 1) {
    u_printf("Trying to open %s:%s\n", packageName.data(), loc);
  }
  LocalUResourceBundlePointer aResource(
      ures_openDirect(packageName.data(), loc, &status));
  *exists = FALSE;
  if (U_SUCCESS(status)) {
    *exists = TRUE;
    if (VERBOSE > 1) {
      u_printf("%s:%s existed!\n", packageName.data(), loc);
    }
    return 0;
  } else if (status == U_MISSING_RESOURCE_ERROR) {
    *exists = FALSE;
    if (VERBOSE > 1) {
      u_printf("%s:%s did NOT exist (%s)!\n",
               packageName.data(),
               loc,
               u_errorName(status));
    }
    return 0;  // "good" failure
  } else {
    // some other failure..
    // PROG fills the third %s; the argument list was one entry short.
    u_printf("%s:%d: %s: ERROR %s opening %s:%s for test.\n",
             __FILE__,
             __LINE__,
             PROG,
             u_errorName(status),
             packageName.data(),
             loc);
    return 1;  // abort
  }
}
|||
|
|||
void printIndent(const LocalUFILEPointer& bf, int indent) { |
|||
for (int i = 0; i < indent + 1; i++) { |
|||
u_fprintf(bf.getAlias(), " "); |
|||
} |
|||
} |
|||
|
|||
/**
|
|||
* Dumps a table resource contents |
|||
* if lev==0, skips INSTALLEDLOCALES |
|||
* @return 0 for OK, 1 for err |
|||
*/ |
|||
int dumpAllButInstalledLocales(int lev, |
|||
LocalUResourceBundlePointer& bund, |
|||
LocalUFILEPointer& bf, |
|||
UErrorCode& status) { |
|||
ures_resetIterator(bund.getAlias()); |
|||
const UBool isTable = (UBool)(ures_getType(bund.getAlias()) == URES_TABLE); |
|||
LocalUResourceBundlePointer t; |
|||
while (U_SUCCESS(status) && ures_hasNext(bund.getAlias())) { |
|||
t.adoptInstead(ures_getNextResource(bund.getAlias(), t.orphan(), &status)); |
|||
ASSERT_SUCCESS("while processing table"); |
|||
const char* key = ures_getKey(t.getAlias()); |
|||
if (VERBOSE > 1) { |
|||
u_printf("dump@%d: got key %s\n", lev, key); |
|||
} |
|||
if (lev == 0 && !strcmp(key, INSTALLEDLOCALES)) { |
|||
if (VERBOSE > 1) { |
|||
u_printf("dump: skipping '%s' as it must be evaluated.\n", key); |
|||
} |
|||
} else { |
|||
printIndent(bf, lev); |
|||
u_fprintf(bf.getAlias(), "%s", key); |
|||
switch (ures_getType(t.getAlias())) { |
|||
case URES_STRING: { |
|||
int32_t len = 0; |
|||
const UChar* s = ures_getString(t.getAlias(), &len, &status); |
|||
ASSERT_SUCCESS("getting string"); |
|||
u_fprintf(bf.getAlias(), ":string {\""); |
|||
u_file_write(s, len, bf.getAlias()); |
|||
u_fprintf(bf.getAlias(), "\"}"); |
|||
} break; |
|||
default: { |
|||
u_printf("ERROR: unhandled type in dumpAllButInstalledLocales().\n"); |
|||
return 1; |
|||
} break; |
|||
} |
|||
u_fprintf(bf.getAlias(), "\n"); |
|||
} |
|||
} |
|||
return 0; |
|||
} |
|||
|
|||
int list(const char* toBundle) { |
|||
UErrorCode status = U_ZERO_ERROR; |
|||
|
|||
LocalUFILEPointer bf; |
|||
|
|||
if (toBundle != NULL) { |
|||
if (VERBOSE) { |
|||
u_printf("writing to bundle %s\n", toBundle); |
|||
} |
|||
// we write UTF-8 with BOM only. No exceptions.
|
|||
bf.adoptInstead(u_fopen(toBundle, "w", "en_US_POSIX", "UTF-8")); |
|||
if (bf.isNull()) { |
|||
u_printf("ERROR: Could not open '%s' for writing.\n", toBundle); |
|||
return 1; |
|||
} |
|||
u_fputc(0xFEFF, bf.getAlias()); // write BOM
|
|||
u_fprintf(bf.getAlias(), "// -*- Coding: utf-8; -*-\n//\n"); |
|||
} |
|||
|
|||
// first, calculate the bundle name.
|
|||
calculatePackageName(&status); |
|||
ASSERT_SUCCESS("calculating package name"); |
|||
|
|||
if (VERBOSE) { |
|||
u_printf("\"locale\": %s\n", locale); |
|||
} |
|||
|
|||
LocalUResourceBundlePointer bund( |
|||
ures_openDirect(packageName.data(), locale, &status)); |
|||
ASSERT_SUCCESS("while opening the bundle"); |
|||
LocalUResourceBundlePointer installedLocales( |
|||
ures_getByKey(bund.getAlias(), INSTALLEDLOCALES, NULL, &status)); |
|||
ASSERT_SUCCESS("while fetching installed locales"); |
|||
|
|||
int32_t count = ures_getSize(installedLocales.getAlias()); |
|||
if (VERBOSE) { |
|||
u_printf("Locales: %d\n", count); |
|||
} |
|||
|
|||
if (bf.isValid()) { |
|||
// write the HEADER
|
|||
u_fprintf(bf.getAlias(), |
|||
"// Warning this file is automatically generated\n" |
|||
"// Updated by %s based on %s:%s.txt\n", |
|||
PROG, |
|||
packageName.data(), |
|||
locale); |
|||
u_fprintf(bf.getAlias(), |
|||
"%s:table(nofallback) {\n" |
|||
" // First, everything besides InstalledLocales:\n", |
|||
locale); |
|||
if (dumpAllButInstalledLocales(0, bund, bf, status)) { |
|||
u_printf("Error dumping prolog for %s\n", toBundle); |
|||
return 1; |
|||
} |
|||
ASSERT_SUCCESS("while writing prolog"); // in case an error was missed
|
|||
|
|||
u_fprintf(bf.getAlias(), |
|||
" %s:table { // %d locales in input %s.res\n", |
|||
INSTALLEDLOCALES, |
|||
count, |
|||
locale); |
|||
} |
|||
|
|||
// OK, now list them.
|
|||
LocalUResourceBundlePointer subkey; |
|||
|
|||
int validCount = 0; |
|||
for (int32_t i = 0; i < count; i++) { |
|||
subkey.adoptInstead(ures_getByIndex( |
|||
installedLocales.getAlias(), i, subkey.orphan(), &status)); |
|||
ASSERT_SUCCESS("while fetching an installed locale's name"); |
|||
|
|||
const char* key = ures_getKey(subkey.getAlias()); |
|||
if (VERBOSE > 1) { |
|||
u_printf("@%d: %s\n", i, key); |
|||
} |
|||
// now, see if the locale is installed..
|
|||
|
|||
UBool exists; |
|||
if (localeExists(key, &exists)) { |
|||
return 1; // get out.
|
|||
} |
|||
if (exists) { |
|||
validCount++; |
|||
u_printf("%s\n", key); |
|||
if (bf.isValid()) { |
|||
u_fprintf(bf.getAlias(), " %s {\"\"}\n", key); |
|||
} |
|||
} else { |
|||
if (bf.isValid()) { |
|||
u_fprintf(bf.getAlias(), "// %s {\"\"}\n", key); |
|||
} |
|||
if (VERBOSE) { |
|||
u_printf("#%s\n", key); // verbosity one - '' vs '#'
|
|||
} |
|||
} |
|||
} |
|||
|
|||
if (bf.isValid()) { |
|||
u_fprintf(bf.getAlias(), " } // %d/%d valid\n", validCount, count); |
|||
// write the FOOTER (close the top-level table)
|
|||
u_fprintf(bf.getAlias(), "}\n"); |
|||
} |
|||
return 0; |
|||
} |
|||
|
|||
int main(int argc, const char* argv[]) { |
|||
PROG = argv[0]; |
|||
for (int i = 1; i < argc; i++) { |
|||
const char* arg = argv[i]; |
|||
int argsLeft = argc - i - 1; /* how many remain? */ |
|||
if (!strcmp(arg, "-v")) { |
|||
VERBOSE++; |
|||
} else if (!strcmp(arg, "-i") && (argsLeft >= 1)) { |
|||
if (i != 1) { |
|||
u_printf("ERROR: -i must be the first argument given.\n"); |
|||
usage(); |
|||
return 1; |
|||
} |
|||
const char* dir = argv[++i]; |
|||
u_setDataDirectory(dir); |
|||
if (VERBOSE) { |
|||
u_printf("ICUDATA is now %s\n", dir); |
|||
} |
|||
} else if (!strcmp(arg, "-T") && (argsLeft >= 1)) { |
|||
TREE = argv[++i]; |
|||
if (VERBOSE) { |
|||
u_printf("TREE is now %s\n", TREE); |
|||
} |
|||
} else if (!strcmp(arg, "-N") && (argsLeft >= 1)) { |
|||
NAME = argv[++i]; |
|||
if (VERBOSE) { |
|||
u_printf("NAME is now %s\n", NAME); |
|||
} |
|||
} else if (!strcmp(arg, "-?") || !strcmp(arg, "-h")) { |
|||
usage(); |
|||
return 0; |
|||
} else if (!strcmp(arg, "-l")) { |
|||
if (list(NULL)) { |
|||
return 1; |
|||
} |
|||
} else if (!strcmp(arg, "-b") && (argsLeft >= 1)) { |
|||
if (list(argv[++i])) { |
|||
return 1; |
|||
} |
|||
} else { |
|||
u_printf("Unknown or malformed option: %s\n", arg); |
|||
usage(); |
|||
return 1; |
|||
} |
|||
} |
|||
} |
|||
|
|||
// Local Variables:
|
|||
// compile-command: "icurun iculslocs.cpp"
|
|||
// End:
|
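For orientation, the text that `list()` writes when called with `-b` (the file that `genrb` later compiles into `res_index.res`) can be approximated in a few lines of Python. This is a sketch, not part of the commit: `render_res_index` and its parameters are hypothetical names, the default package name is only an example, and the exact indentation is assumed. The BOM, the coding line, and the dump of keys other than `InstalledLocales` are left out.

```python
def render_res_index(locale, installed, exists,
                     prog="iculslocs", package="icudt53l"):
    """Approximate the res_index text produced by list() in iculslocs.

    locale    -- bundle name, e.g. "res_index"
    installed -- ordered locale names found under InstalledLocales
    exists    -- callable(name) -> bool, True if the locale's data is present
    """
    lines = [
        "// Warning this file is automatically generated",
        "// Updated by %s based on %s:%s.txt" % (prog, package, locale),
        "%s:table(nofallback) {" % locale,
        "  InstalledLocales:table { // %d locales in input %s.res"
        % (len(installed), locale),
    ]
    valid = 0
    for name in installed:
        if exists(name):
            valid += 1
            lines.append('    %s {""}' % name)   # locale kept
        else:
            lines.append('//  %s {""}' % name)   # locale commented out
    lines.append("  } // %d/%d valid" % (valid, len(installed)))
    lines.append("}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    present = {"en", "root"}
    print(render_res_index("res_index", ["en", "de", "root"],
                           lambda name: name in present))
```

Locales whose data has been trimmed stay in the file as comments, which makes it easy to see what was dropped when comparing the generated file with the original index.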
@ -0,0 +1,338 @@ |
|||
#!/usr/bin/python |
|||
# |
|||
# Copyright (C) 2014 IBM Corporation and Others. All Rights Reserved. |
|||
# |
|||
# @author Steven R. Loomis <srl@icu-project.org> |
|||
# |
|||
# This tool slims down an ICU data (.dat) file according to a config file. |
|||
# |
|||
# See: http://bugs.icu-project.org/trac/ticket/10922 |
|||
# |
|||
# Usage: |
|||
# Use "-h" to get help options. |
|||
|
|||
import sys |
|||
import shutil |
|||
# for utf-8 |
|||
reload(sys) |
|||
sys.setdefaultencoding("utf-8") |
|||
|
|||
import argparse |
|||
import os |
|||
import json |
|||
import re |
|||
|
|||
endian=sys.byteorder |
|||
|
|||
parser = argparse.ArgumentParser(description="ICU Datafile repackager. Example of use: \"mkdir tmp ; python icutrim.py -D ~/Downloads/icudt53l.dat -T tmp -F trim_en.json -O icudt53l.dat\" you will then find a smaller icudt53l.dat in 'tmp'. ", |
|||
epilog="ICU tool, http://icu-project.org - master copy at http://source.icu-project.org/repos/icu/tools/trunk/scripts/icutrim.py") |
|||
|
|||
parser.add_argument("-P","--tool-path", |
|||
action="store", |
|||
dest="toolpath", |
|||
help="set the prefix directory for ICU tools") |
|||
|
|||
parser.add_argument("-D","--input-file", |
|||
action="store", |
|||
dest="datfile", |
|||
help="input data file (icudt__.dat)", |
|||
required=True) |
|||
|
|||
parser.add_argument("-F","--filter-file", |
|||
action="store", |
|||
dest="filterfile", |
|||
help="filter file (JSON format)", |
|||
required=True) |
|||
|
|||
parser.add_argument("-T","--tmp-dir", |
|||
action="store", |
|||
dest="tmpdir", |
|||
help="working directory.", |
|||
required=True) |
|||
|
|||
parser.add_argument("--delete-tmp", |
|||
action="count", |
|||
dest="deltmpdir", |
|||
help="delete working directory.", |
|||
default=0) |
|||
|
|||
parser.add_argument("-O","--outfile", |
|||
action="store", |
|||
dest="outfile", |
|||
help="outfile (NOT a full path)", |
|||
required=True) |
|||
|
|||
parser.add_argument("-v","--verbose", |
|||
action="count", |
|||
default=0) |
|||
|
|||
parser.add_argument('-e', '--endian', action='store', dest='endian', help='endian, big, little or host, your default is "%s".' % endian, default=endian, metavar='endianness') |
|||
|
|||
|
|||
args = parser.parse_args() |
|||
|
|||
if args.verbose>0: |
|||
print "Options: "+str(args) |
|||
|
|||
if (os.path.isdir(args.tmpdir) and args.deltmpdir): |
|||
if args.verbose>1: |
|||
print "Deleting tmp dir %s.." % (args.tmpdir) |
|||
shutil.rmtree(args.tmpdir) |
|||
|
|||
if not (os.path.isdir(args.tmpdir)): |
|||
os.mkdir(args.tmpdir) |
|||
else: |
|||
print "Please delete tmpdir %s before beginning." % args.tmpdir |
|||
sys.exit(1) |
|||
|
|||
if args.endian not in ("big","little","host"): |
|||
print "Unknown endianness: %s" % args.endian |
|||
sys.exit(1) |
|||
|
|||
if args.endian == "host": |
|||
args.endian = endian |
|||
|
|||
if not os.path.isdir(args.tmpdir): |
|||
print "Error, tmpdir not a directory: %s" % (args.tmpdir) |
|||
sys.exit(1) |
|||
|
|||
if not os.path.isfile(args.filterfile): |
|||
print "Filterfile doesn't exist: %s" % (args.filterfile) |
|||
sys.exit(1) |
|||
|
|||
if not os.path.isfile(args.datfile): |
|||
print "Datfile doesn't exist: %s" % (args.datfile) |
|||
sys.exit(1) |
|||
|
|||
if not args.datfile.endswith(".dat"): |
|||
print "Datfile doesn't end with .dat: %s" % (args.datfile) |
|||
sys.exit(1) |
|||
|
|||
outfile = os.path.join(args.tmpdir, args.outfile) |
|||
|
|||
if os.path.isfile(outfile): |
|||
print "Error, output file already exists: %s" % (outfile) |
|||
sys.exit(1) |
|||
|
|||
if not args.outfile.endswith(".dat"): |
|||
print "Outfile doesn't end with .dat: %s" % (args.outfile) |
|||
sys.exit(1) |
|||
|
|||
dataname=args.outfile[0:-4] |
|||
|
|||
|
|||
## TODO: need to improve this. Quotes, etc. |
|||
def runcmd(tool, cmd, doContinue=False): |
|||
if(args.toolpath): |
|||
cmd = os.path.join(args.toolpath, tool) + " " + cmd |
|||
else: |
|||
cmd = tool + " " + cmd |
|||
|
|||
if(args.verbose>4): |
|||
print "# " + cmd |
|||
|
|||
rc = os.system(cmd) |
|||
if rc != 0 and not doContinue: |
|||
print "FAILED: %s" % cmd |
|||
sys.exit(1) |
|||
return rc |
|||
|
|||
## STEP 0 - read in json config |
|||
fi= open(args.filterfile, "rb") |
|||
config=json.load(fi) |
|||
fi.close() |
|||
|
|||
if (args.verbose > 6): |
|||
print config |
|||
|
|||
if(config.has_key("comment")): |
|||
print "%s: %s" % (args.filterfile, config["comment"]) |
|||
|
|||
## STEP 1 - copy the data file, swapping endianness |
|||
endian_letter = "l" |
|||
|
|||
|
|||
runcmd("icupkg", "-t%s %s %s" % (endian_letter, args.datfile, outfile)) |
|||
|
|||
## STEP 2 - get listing |
|||
listfile = os.path.join(args.tmpdir,"icudata.lst") |
|||
runcmd("icupkg", "-l %s > %s" % (outfile, listfile)) |
|||
|
|||
fi = open(listfile, 'rb') |
|||
items = fi.readlines() |
|||
items = [items[i].strip() for i in range(len(items))] |
|||
fi.close() |
|||
|
|||
itemset = set(items) |
|||
|
|||
if (args.verbose>1): |
|||
print "input file: %d items" % (len(items)) |
|||
|
|||
# list of all trees |
|||
trees = {} |
|||
RES_INDX = "res_index.res" |
|||
remove = None |
|||
# remove - always remove these |
|||
if config.has_key("remove"): |
|||
remove = set(config["remove"]) |
|||
else: |
|||
remove = set() |
|||
|
|||
# keep - always keep these |
|||
if config.has_key("keep"): |
|||
keep = set(config["keep"]) |
|||
else: |
|||
keep = set() |
|||
|
|||
def queueForRemoval(tree): |
|||
global remove |
|||
if not config.has_key("trees"): |
|||
# no config |
|||
return |
|||
if not config["trees"].has_key(tree): |
|||
return |
|||
mytree = trees[tree] |
|||
if(args.verbose>0): |
|||
print "* %s: %d items" % (tree, len(mytree["locs"])) |
|||
# do variable substitution for this tree here |
|||
if type(config["trees"][tree]) == str or type(config["trees"][tree]) == unicode: |
|||
treeStr = config["trees"][tree] |
|||
if(args.verbose>5): |
|||
print " Substituting $%s for tree %s" % (treeStr, tree) |
|||
if(not config.has_key("variables") or not config["variables"].has_key(treeStr)): |
|||
print " ERROR: no variable: variables.%s for tree %s" % (treeStr, tree) |
|||
sys.exit(1) |
|||
config["trees"][tree] = config["variables"][treeStr] |
|||
myconfig = config["trees"][tree] |
|||
if(args.verbose>4): |
|||
print " Config: %s" % (myconfig) |
|||
# Process this tree |
|||
if(len(myconfig)==0 or len(mytree["locs"])==0): |
|||
if(args.verbose>2): |
|||
print " No processing for %s - skipping" % (tree) |
|||
else: |
|||
only = None |
|||
if myconfig.has_key("only"): |
|||
only = set(myconfig["only"]) |
|||
if (len(only)==0) and (mytree["treeprefix"] != ""): |
|||
thePool = "%spool.res" % (mytree["treeprefix"]) |
|||
if (thePool in itemset): |
|||
if(args.verbose>0): |
|||
print "Removing %s because tree %s is empty." % (thePool, tree) |
|||
remove.add(thePool) |
|||
else: |
|||
print "tree %s - no ONLY" % tree |
|||
for l in range(len(mytree["locs"])): |
|||
loc = mytree["locs"][l] |
|||
if (only is not None) and not loc in only: |
|||
# REMOVE loc |
|||
toRemove = "%s%s%s" % (mytree["treeprefix"], loc, mytree["extension"]) |
|||
if(args.verbose>6): |
|||
print "Queueing for removal: %s" % toRemove |
|||
remove.add(toRemove) |
|||
|
|||
def addTreeByType(tree, mytree): |
|||
if(args.verbose>1): |
|||
print "(considering %s): %s" % (tree, mytree) |
|||
trees[tree] = mytree |
|||
mytree["locs"]=[] |
|||
for i in range(len(items)): |
|||
item = items[i] |
|||
if item.startswith(mytree["treeprefix"]) and item.endswith(mytree["extension"]): |
|||
mytree["locs"].append(item[len(mytree["treeprefix"]):-4]) |
|||
# now, process |
|||
queueForRemoval(tree) |
|||
|
|||
addTreeByType("converters",{"treeprefix":"", "extension":".cnv"}) |
|||
addTreeByType("stringprep",{"treeprefix":"", "extension":".spp"}) |
|||
addTreeByType("translit",{"treeprefix":"translit/", "extension":".res"}) |
|||
addTreeByType("brkfiles",{"treeprefix":"brkitr/", "extension":".brk"}) |
|||
addTreeByType("brkdict",{"treeprefix":"brkitr/", "extension":"dict"}) |
|||
addTreeByType("confusables",{"treeprefix":"", "extension":".cfu"}) |
|||
|
|||
for i in range(len(items)): |
|||
item = items[i] |
|||
if item.endswith(RES_INDX): |
|||
treeprefix = item[0:item.rindex(RES_INDX)] |
|||
tree = None |
|||
if treeprefix == "": |
|||
tree = "ROOT" |
|||
else: |
|||
tree = treeprefix[0:-1] |
|||
if(args.verbose>6): |
|||
print "processing %s" % (tree) |
|||
trees[tree] = { "extension": ".res", "treeprefix": treeprefix, "hasIndex": True } |
|||
# read in the resource list for the tree |
|||
treelistfile = os.path.join(args.tmpdir,"%s.lst" % tree) |
|||
runcmd("iculslocs", "-i %s -N %s -T %s -l > %s" % (outfile, dataname, tree, treelistfile)) |
|||
fi = open(treelistfile, 'rb') |
|||
treeitems = fi.readlines() |
|||
trees[tree]["locs"] = [treeitems[i].strip() for i in range(len(treeitems))] |
|||
fi.close() |
|||
if(not config.has_key("trees") or not config["trees"].has_key(tree)): |
|||
print " Warning: filter file %s does not mention trees.%s - will be kept as-is" % (args.filterfile, tree) |
|||
else: |
|||
queueForRemoval(tree) |
|||
|
|||
def removeList(count=0): |
|||
# don't allow "keep" items to creep in here. |
|||
global remove |
|||
remove = remove - keep |
|||
if(count > 10): |
|||
print "Giving up - %dth attempt at removal." % count |
|||
sys.exit(1) |
|||
if(args.verbose>1): |
|||
print "%d items to remove - try #%d" % (len(remove),count) |
|||
if(len(remove)>0): |
|||
oldcount = len(remove) |
|||
hackerrfile=os.path.join(args.tmpdir, "REMOVE.err") |
|||
removefile = os.path.join(args.tmpdir, "REMOVE.lst") |
|||
fi = open(removefile, 'wb') |
|||
for i in remove: |
|||
print >>fi, i |
|||
fi.close() |
|||
rc = runcmd("icupkg","-r %s %s 2> %s" % (removefile,outfile,hackerrfile),True) |
|||
if rc != 0: |
|||
if(args.verbose>5): |
|||
print "## Damage control, trying to parse stderr from icupkg.." |
|||
fi = open(hackerrfile, 'rb') |
|||
erritems = fi.readlines() |
|||
fi.close() |
|||
#Item zone/zh_Hant_TW.res depends on missing item zone/zh_Hant.res |
|||
pat = re.compile("""^Item ([^ ]+) depends on missing item ([^ ]+).*""") |
|||
for i in range(len(erritems)): |
|||
line = erritems[i].strip() |
|||
m = pat.match(line) |
|||
if m: |
|||
toDelete = m.group(1) |
|||
if(args.verbose > 5): |
|||
print "<< %s added to delete" % toDelete |
|||
remove.add(toDelete) |
|||
else: |
|||
print "ERROR: could not match errline: %s" % line |
|||
sys.exit(1) |
|||
if(args.verbose > 5): |
|||
print " now %d items to remove" % len(remove) |
|||
if(oldcount == len(remove)): |
|||
print " ERROR: could not add any more items to remove. Fail." |
|||
sys.exit(1) |
|||
removeList(count+1) |
|||
|
|||
# fire it up |
|||
removeList(1) |
|||
|
|||
# now, fixup res_index, one at a time |
|||
for tree in trees: |
|||
# skip trees that don't have res_index |
|||
if not trees[tree].has_key("hasIndex"): |
|||
continue |
|||
treebunddir = args.tmpdir |
|||
if(trees[tree]["treeprefix"]): |
|||
treebunddir = os.path.join(treebunddir, trees[tree]["treeprefix"]) |
|||
if not (os.path.isdir(treebunddir)): |
|||
os.mkdir(treebunddir) |
|||
treebundres = os.path.join(treebunddir,RES_INDX) |
|||
treebundtxt = "%s.txt" % (treebundres[0:-4]) |
|||
runcmd("iculslocs", "-i %s -N %s -T %s -b %s" % (outfile, dataname, tree, treebundtxt)) |
|||
runcmd("genrb","-d %s -s %s res_index.txt" % (treebunddir, treebunddir)) |
|||
runcmd("icupkg","-s %s -a %s%s %s" % (args.tmpdir, trees[tree]["treeprefix"], RES_INDX, outfile)) |
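The core of the trimming logic above can be condensed into two small functions: one that applies a tree's `only` list (minus the global `keep` set), and one that reads `icupkg` error output to find items that depend on something already removed. The following is a minimal sketch written for Python 3, not the script's actual code: the function names are hypothetical, and unlike `removeList()` it silently ignores error lines that do not match instead of exiting.

```python
import re

# Same message shape that removeList() parses from icupkg's stderr.
DEPENDS = re.compile(r"^Item ([^ ]+) depends on missing item ([^ ]+)")


def queue_for_removal(locs, treeprefix, extension, only, keep=()):
    """Return item names to drop for one tree.

    Every locale not listed in `only` is queued, then anything in `keep`
    is taken back out (mirroring `remove = remove - keep`).
    """
    only = set(only)
    remove = set()
    for loc in locs:
        if loc not in only:
            remove.add("%s%s%s" % (treeprefix, loc, extension))
    return remove - set(keep)


def dependents_from_errors(stderr_lines):
    """Collect items that icupkg reports as depending on a missing item."""
    found = set()
    for line in stderr_lines:
        m = DEPENDS.match(line.strip())
        if m:
            found.add(m.group(1))  # the dependent item must be removed too
    return found


if __name__ == "__main__":
    queued = queue_for_removal(["en", "de", "zh_Hant"], "zone/", ".res",
                               only=["en"], keep=["zone/de.res"])
    print(sorted(queued))
    print(sorted(dependents_from_errors(
        ["Item zone/zh_Hant_TW.res depends on missing item zone/zh_Hant.res"])))
```

In the script itself these two steps alternate: each failed `icupkg -r` run adds the reported dependents to the removal set, and the removal is retried until it succeeds, no new items are found, or the attempt limit is reached.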
@ -0,0 +1,18 @@ |
|||
/*
|
|||
********************************************************************** |
|||
* Copyright (C) 2014, International Business Machines |
|||
* Corporation and others. All Rights Reserved. |
|||
********************************************************************** |
|||
* |
|||
*/ |
|||
|
|||
//
|
|||
// ICU needs the C++, not the C linker to be used, even if the main function
|
|||
// is in C.
|
|||
//
|
|||
// This is a dummy function just to get gyp to compile some internal
|
|||
// tools as C++.
|
|||
//
|
|||
// It should not appear in production node binaries.
|
|||
|
|||
extern void icu_dummy_cxx() {} |