| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
| | | |
|
| | |
| | | |
charset_arg_parents and add charset_arg_subset and
charset_arg_superset.
(enum charset_attr_index): Delete charset_parents and add
charset_subset and charset_superset.
(enum charset_method): Delete CHARSET_METHOD_INHERIT and add
CHARSET_METHOD_SUBSET and CHARSET_METHOD_SUPERSET.
(CHARSET_ATTR_PARENTS, CHARSET_PARENTS): Macros deleted.
(CHARSET_ATTR_SUBSET, CHARSET_ATTR_SUPERSET, CHARSET_SUBSET,
CHARSET_SUPERSET): New macros.
(charset_work): Extern it.
(ENCODE_CHAR): Use charset_work.
(CHAR_CHARSET_P): Adjusted for the change of encoder format.
(map_charset_chars): Extern it.
|
| | |
| | |
| | |
| | | |
charset_jisx0208): Extern them.
|
| | |
| | |
| | |
| | |
| | | |
charset_arg_max_code.
(struct charset): New member char_index_offset.
|
| | | |
|
| | | |
|
| | |
| | |
| | |
| | | |
sequence handling code is moved to character.c.
|
| |/
|/| |
|
| | |
|
| |
| |
| |
| | |
(UNIBYTE_STR_AS_MULTIBYTE_P): Check more rigidly.
|
| | |
|
| | |
|
| | |
|
| | |
|
| |
| |
| |
| |
| |
| |
| | |
(parse_str_as_multibyte): Declarations updated.
(FETCH_STRING_CHAR_ADVANCE):
(FETCH_STRING_CHAR_ADVANCE_NO_CHECK): Use const for pointer to
lisp string data.
|
| |
| |
| |
| | |
instead of XSTRING()->size_byte.
|
| |
| |
| |
| |
| | |
FETCH_STRING_CHAR_ADVANCE_NO_CHECK): Use SDATA when getting
address of string contents.
|
| |
| |
| |
| |
| | |
SCHARS, SBYTES, STRING_INTERVALS, SREF, SDATA; explicit size_byte references
left unchanged for now.
|
|/
|
|
| |
Bound the search with MAX_MULTIBYTE_LENGTH to avoid a pathological case.
|
|
|
|
| |
value to prevent gcc warnings.
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
(MIN_CHAR_OFFICIAL_DIMENSION1): Define it as ((0x81 - 0x70) << 7).
|
| |
|
| |
|
| |
|
|
|
|
|
| |
(UNIBYTE_STR_AS_MULTIBYTE_P): Fix for an invalid multibyte
sequence.
|
| |
CHARSET_8_BIT_GRAPHIC): New macros.
(SINGLE_BYTE_CHAR_P): Make it faster by using casting.
(CHARSET_ISO_GRAPHIC_PLANE): Use XINT instead of XFASTINT.
(CHARSET_REVERSE_CHARSET): Likewise.
(CHARSET_VALID_P): Handle new charsets: eight-bit-control and
eight-bit-graphic.
(BYTES_BY_CHAR_HEAD, WIDTH_BY_CHAR_HEAD): Optimize for ASCII.
(CHAR_CHARSET, MAKE_CHAR, SPLIT_CHAR, CHAR_BYTES): Likewise.
(PARSE_MULTIBYTE_SEQ) [BYTE_COMBINING_DEBUG]: Abort if we
encounter an invalid multibyte sequence.
(PARSE_MULTIBYTE_SEQ) [not BYTE_COMBINING_DEBUG]: Assume multibyte
sequence is always valid.
(MAKE_NON_ASCII_CHAR, SPLIT_NON_ASCII_CHAR): These macros deleted.
(UNIBYTE_STR_AS_MULTIBYTE_P, MULTIBYTE_STR_AS_UNIBYTE_P): New
macros.
(CHAR_STRING): For 8-bit characters, call char_to_string.
(INC_POS) [not BYTE_COMBINING_DEBUG]: Faster version. Assume
multibyte sequence is always valid.
(BUF_INC_POS) [not BYTE_COMBINING_DEBUG]: Likewise.
(parse_str_as_multibyte, str_as_multibyte, str_to_multibyte,
str_as_unibyte): Extern them.
(BCOPY_SHORT): Fix a bug.
(CHAR_LEN): This macro deleted. Callers changed to use
CHAR_BYTES.
(FETCH_STRING_CHAR_ADVANCE): Check multibyteness of STRING.
(FETCH_STRING_CHAR_ADVANCE_NO_CHECK): New macro.
(FETCH_CHAR_ADVANCE): Check multibyteness of the current buffer.
|
| |
|
| |
(RE_MULTIBYTE_P, RE_STRING_CHAR_AND_LENGTH): New macros.
(GET_CHAR_BEFORE_2): Moved from charset.h, and fixed a minor bug when
we are between str1 and str2.
(MAX_MULTIBYTE_LENGTH, CHAR_STRING) [!emacs]: Provide trivial default.
(PATFETCH): Use `TRANSLATE'.
(PATFETCH_RAW): Fetch multibyte char if applicable.
(PATUNFETCH): Remove.
(regex_compile): Rely on PATFETCH to do most of the multibyte magic.
When writing a char, write it directly into the pattern buffer rather
than going needlessly through a temp char-array.
(re_match_2_internal): Similarly, rely on RE_STRING_CHAR to do the
multibyte magic and remove the useless `#ifdef emacs'.
(bcmp_translate): Don't compare as multibyte chars when in a unibyte
buffer.
* regex.h (struct re_pattern_buffer): Make field `multibyte'
conditional on `emacs'.
* charset.h (GET_CHAR_BEFORE_2): Moved to regex.c.
|
| |
string char type. It's `const unsigned char' to match the rest of Emacs.
Consistently make sure all pointers to strings use it and make sure all
pointers into the pattern use `unsigned char'.
(re_match_2_internal): Use `PREFETCH+STRING_CHAR' instead of
GET_CHAR_AFTER_2.
Also merge wordbound and notwordbound to reduce code duplication.
* charset.h (GET_CHAR_AFTER_2): Remove.
(GET_CHAR_BEFORE_2): Use unsigned chars, like everywhere else.
|
|
|
|
| |
of GLYPH_MASK_CHAR.
|
| |
composite character is deleted.
(LEADING_CODE_COMPOSITION) (CHARSET_COMPOSITION)
(charset_composition) (MIN_CHAR_COMPOSITION)
(MAX_CHAR_COMPOSITION) (GENERIC_COMPOSITION_CHAR)
(COMPOSITE_CHAR_P) (MAKE_COMPOSITE_CHAR) (COMPOSITE_CHAR_ID)
(PARSE_COMPOSITE_SEQ) (PARSE_CHARACTER_SEQ): Deleted.
(MAX_CHAR) (CHARSET_VALID_P) (CHARSET_DEFINED_P) (CHARSET_AT)
(FIRST_CHARSET_AT) (SAME_CHARSET_P) (MAKE_NON_ASCII_CHAR)
(PARSE_MULTIBYTE_SEQ) (SPLIT_NON_ASCII_CHAR) (CHAR_PRINTABLE_P):
Modified.
(SPLIT_STRING): Call split_string, not split_non_ascii_string.
(CHAR_STRING): Delete WORKBUF argument. Call char_string, not
non_ascii_char_to_string.
(STRING_CHAR): Call string_to_char, not string_to_non_ascii_char.
(STRING_CHAR_AND_LENGTH): Likewise.
(FETCH_CHAR_ADVANCE): New macro.
(MAX_COMPONENT_COUNT) (struct cmpchar_info): Deleted.
(MAX_MULTIBYTE_LENGTH): New macro.
(MAX_LENGTH_OF_MULTI_BYTE_FORM): Deleted.
(find_charset_in_str): Argument adjusted.
(CHAR_LEN): Modified.
|
| |
|
|
|
|
|
| |
(PARSE_MULTIBYTE_SEQ): Make it work also for ASCII string.
(STRING_CHAR_AND_CHAR_LENGTH): This macro removed.
|
| |
are negative.
(MAKE_CHAR): Don't set MSBs of C1 and C2 to 0.
(VALID_MULTIBYTE_CHAR_P): This macro deleted.
(PARSE_COMPOSITE_SEQ): New macro.
(PARSE_CHARACTER_SEQ): New macro.
(PARSE_MULTIBYTE_SEQ): New macro.
(CHAR_PRINTABLE_P): New macro.
(STRING_CHAR): Adjusted for the change of string_to_non_ascii_char.
(STRING_CHAR_AND_LENGTH): Likewise.
(STRING_CHAR_AND_CHAR_LENGTH): Define it as STRING_CHAR_AND_LENGTH.
(INC_POS): Use the macro PARSE_MULTIBYTE_SEQ.
(DEC_POS, BUF_INC_POS, BUF_DEC_POS): Likewise.
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
| |
character correctly.
(STRING_CHAR): Handle an invalid character correctly.
|