NOTES ON TREESIT_RECORD_CHANGE
It is vital that Emacs inform tree-sitter of every change made to the
buffer, lest tree-sitter's parse tree become corrupted or out of sync.
Almost all buffer changes in Emacs are made through functions in
insdel.c (see below for exceptions), so I augmented the functions in
insdel.c with calls to treesit_record_change. Below is a manifest of
all the relevant functions in insdel.c as of Emacs 29:
Function                              Calls
----------------------------------------------------------------------
copy_text                             (*1)
insert                                insert_1_both
insert_and_inherit                    insert_1_both
insert_char                           insert
insert_string                         insert
insert_before_markers                 insert_1_both
insert_before_markers_and_inherit     insert_1_both
insert_1_both                         treesit_record_change
insert_from_string                    insert_from_string_1
insert_from_string_before_markers     insert_from_string_1
insert_from_string_1                  treesit_record_change
insert_from_gap_1                     treesit_record_change
insert_from_gap                       insert_from_gap_1
insert_from_buffer                    treesit_record_change
insert_from_buffer_1                  (used by insert_from_buffer) (*2)
replace_range                         treesit_record_change
replace_range_2                       (caller needs to call treesit_r_c)
del_range                             del_range_1
del_range_1                           del_range_2
del_range_byte                        del_range_2
del_range_both                        del_range_2
del_range_2                           treesit_record_change
(*1) This function is used only to copy from string to string when
used outside of insdel.c; when it is used inside insdel.c, the caller
calls treesit_record_change.
(*2) This function is static, and insert_from_buffer is its only
caller, so it should be fine to call treesit_record_change in
insert_from_buffer rather than in insert_from_buffer_1. I also left a
reminder comment.
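To illustrate the pattern in the manifest above, here is a minimal,
self-contained sketch. It is not Emacs code: the toy_* names, the flat
(gapless) buffer, and the edit struct are all assumptions made for
illustration. It shows only the idea that high-level entry points
funnel into a low-level primitive, and that the primitive alone
reports the change as (start, old end, new end) byte positions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical edit record; illustrative only, not the Emacs or
   tree-sitter definition.  */
struct toy_edit
{
  ptrdiff_t start_byte;    /* where the change begins */
  ptrdiff_t old_end_byte;  /* end of the affected text before the change */
  ptrdiff_t new_end_byte;  /* end of the affected text after the change */
};

struct toy_buffer
{
  char text[256];
  ptrdiff_t len;
  struct toy_edit last_edit;  /* most recent change reported */
  int edits_recorded;         /* how many changes were reported */
};

/* Plays the role of treesit_record_change in this sketch.  */
static void
toy_record_change (struct toy_buffer *b, ptrdiff_t start,
                   ptrdiff_t old_end, ptrdiff_t new_end)
{
  b->last_edit = (struct toy_edit) { start, old_end, new_end };
  b->edits_recorded++;
}

/* Lowest-level insertion: the only insertion code that touches the
   text, and therefore the only one that must report the change.  */
static void
toy_insert_1 (struct toy_buffer *b, ptrdiff_t pos,
              const char *s, ptrdiff_t n)
{
  assert (0 <= pos && pos <= b->len);
  assert (b->len + n < (ptrdiff_t) sizeof b->text);
  memmove (b->text + pos + n, b->text + pos, b->len - pos);
  memcpy (b->text + pos, s, n);
  b->len += n;
  b->text[b->len] = '\0';
  /* Nothing was replaced, so the old end equals the start.  */
  toy_record_change (b, pos, pos, pos + n);
}

/* Higher-level wrapper: it delegates, so it needs no call of its own.  */
static void
toy_insert (struct toy_buffer *b, ptrdiff_t pos, const char *s)
{
  toy_insert_1 (b, pos, s, (ptrdiff_t) strlen (s));
}

/* Lowest-level deletion of the half-open range [from, to).  */
static void
toy_del_range_2 (struct toy_buffer *b, ptrdiff_t from, ptrdiff_t to)
{
  assert (0 <= from && from <= to && to <= b->len);
  memmove (b->text + from, b->text + to, b->len - to);
  b->len -= to - from;
  b->text[b->len] = '\0';
  /* Nothing was inserted, so the new end equals the start.  */
  toy_record_change (b, from, to, from);
}
```

With a zero-initialized struct toy_buffer, calling toy_insert and then
toy_del_range_2 leaves edits_recorded at 2, one report per primitive
call, even though toy_insert itself never reports anything.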
EXCEPTIONS
There are a couple of functions that replace characters in place
rather than inserting or deleting. They are in casefiddle.c and
editfns.c.
In casefiddle.c, do_casify_unibyte_region and
do_casify_multibyte_region modify the buffer, but they are static
functions called only by casify_region, which calls
treesit_record_change. Other higher-level functions call
casify_region to do the work.
In editfns.c, subst-char-in-region and translate-region-internal might
replace characters in place, so I made them call
treesit_record_change. transpose-regions uses memcpy to move text
around, so it calls treesit_record_change too.
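The in-place case can be sketched as follows. This is a toy under
stated assumptions, not the code in editfns.c: toy_subst_char and
toy_record_change are hypothetical names, the text is a plain char
array, and the sketch reports a single change covering the first
through last modified byte. Because bytes are overwritten rather than
inserted or deleted, the old end and the new end of the reported
change are the same position.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical log of the last reported change; not an Emacs type.  */
struct toy_change
{
  ptrdiff_t start, old_end, new_end;
  int count;  /* number of changes reported so far */
};

/* Plays the role of treesit_record_change in this sketch.  */
static void
toy_record_change (struct toy_change *log, ptrdiff_t start,
                   ptrdiff_t old_end, ptrdiff_t new_end)
{
  log->start = start;
  log->old_end = old_end;
  log->new_end = new_end;
  log->count++;
}

/* Replace every FROMCHAR in text[beg, end) with TOCHAR, in place.
   Return the number of replacements.  If anything changed, report
   one change spanning the first to the last modified byte.  */
static ptrdiff_t
toy_subst_char (char *text, ptrdiff_t beg, ptrdiff_t end,
                char fromchar, char tochar, struct toy_change *log)
{
  ptrdiff_t first = -1, last = -1, n = 0;

  assert (0 <= beg && beg <= end);
  for (ptrdiff_t i = beg; i < end; i++)
    if (text[i] == fromchar)
      {
        text[i] = tochar;  /* direct overwrite: no insdel function involved */
        if (first < 0)
          first = i;
        last = i;
        n++;
      }

  /* The text length is unchanged, so old_end == new_end.  */
  if (n > 0)
    toy_record_change (log, first, last + 1, last + 1);
  return n;
}
```

If nothing matches, the sketch reports nothing, since the text was
not modified.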
I found these exceptions by grepping for signal_after_change and
checking each caller manually. Below are all the results as of Emacs
29, with a comment on each. Readers can use
(highlight-regexp "^[^[:space:]]+?\\.c:[[:digit:]]+:[^z-a]+?$" 'highlight)
to make things easier to read.
grep [...] --color=auto -i --directories=skip -nH --null -e signal_after_change *.c
callproc.c:789: calling prepare_to_modify_buffer and signal_after_change.
callproc.c:793: is one call to signal_after_change in each of the
callproc.c:800: signal_after_change hasn't. A continue statement
callproc.c:804: again, and this time signal_after_change gets called,
Not code.
callproc.c:820: signal_after_change (PT - nread, 0, nread);
callproc.c:863: signal_after_change (PT - process_coding.produced_char,
Both are called in call-process. I don’t think we’ll ever use
tree-sitter in call-process’s stdio buffer, right? I didn’t check
line-by-line, but it seems to only use insert_1_both and del_range_2.
casefiddle.c:558: signal_after_change (start, end - start - added, end - start);
Called in casify-region, calls treesit_record_change.
decompress.c:195: signal_after_change (data->orig, data->start - data->orig,
Called in unwind_decompress, uses del_range_2, an insdel function.
decompress.c:334: signal_after_change (istart, iend - istart, unwind_data.nbytes);
Called in zlib-decompress-region, uses del_range_2, an insdel function.
editfns.c:2139: signal_after_change (BEGV, size_a, ZV - BEGV);
Called in replace-buffer-contents, which calls del_range and
Finsert_buffer_substring, both are ok.
editfns.c:2416: signal_after_change (changed,
Called in subst-char-in-region, which either calls replace_range (an
insdel function) or modifies buffer content by itself (in which case
it needs to call treesit_record_change).
editfns.c:2544: /* Reload as signal_after_change in last iteration may GC. */
Not code.
editfns.c:2604: signal_after_change (pos, 1, 1);
Called in translate-region-internal, which has three cases:
  if (nc != oc && nc >= 0) {
    if (len != str_len) {
      replace_range()
    } else {
      while (str_len-- > 0)
        *p++ = *str++;
    }
  }
  else if (nc < 0) {
    replace_range()
  }
replace_range is ok, but in the case where it manually modifies buffer
content, it needs to call treesit_record_change.
editfns.c:4779: signal_after_change (start1, end2 - start1, end2 - start1);
Called in transpose-regions. It just uses memcpy’s and doesn’t use
insdel functions; needs to call treesit_record_change.
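As an illustration of why a memcpy-based rearrangement needs its own
report, here is a small sketch. It is not the transpose-regions
implementation: toy_transpose and toy_record_change are hypothetical
names, the text is a plain char array, and a scratch allocation is
used for simplicity. The reported change covers everything from the
start of the first region to the end of the second, with an unchanged
length, which matches the arguments of the signal_after_change call
quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical log of the last reported change; not an Emacs type.  */
struct toy_change
{
  ptrdiff_t start, old_end, new_end;
  int count;  /* number of changes reported so far */
};

/* Plays the role of treesit_record_change in this sketch.  */
static void
toy_record_change (struct toy_change *log, ptrdiff_t start,
                   ptrdiff_t old_end, ptrdiff_t new_end)
{
  log->start = start;
  log->old_end = old_end;
  log->new_end = new_end;
  log->count++;
}

/* Swap text[s1, e1) with text[s2, e2), keeping the text between them
   in the middle.  Requires s1 <= e1 <= s2 <= e2.  Return 0 on
   success, or -1 if the scratch allocation fails.  */
static int
toy_transpose (char *text, ptrdiff_t s1, ptrdiff_t e1,
               ptrdiff_t s2, ptrdiff_t e2, struct toy_change *log)
{
  assert (0 <= s1 && s1 <= e1 && e1 <= s2 && s2 <= e2);
  ptrdiff_t len1 = e1 - s1, len_mid = s2 - e1, len2 = e2 - s2;
  ptrdiff_t total = e2 - s1;

  if (total == 0)
    return 0;  /* nothing to move, nothing to report */

  char *tmp = malloc (total);
  if (!tmp)
    return -1;

  /* Build "region 2, middle, region 1" in scratch, then copy back.  */
  memcpy (tmp, text + s2, len2);
  memcpy (tmp + len2, text + e1, len_mid);
  memcpy (tmp + len2 + len_mid, text + s1, len1);
  memcpy (text + s1, tmp, total);
  free (tmp);

  /* No insdel-style primitive ran, so report the change here.  */
  toy_record_change (log, s1, e2, e2);
  return 0;
}
```

Reporting the whole span from s1 to e2 is conservative but simple: it
tells the parser that everything in between may have moved.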
fileio.c:4825: signal_after_change (PT, 0, inserted);
Called in insert_file_contents. Uses insert_1_both (very first in the
function); del_range_1 and del_range_byte (the optimized way to
implement replace when decoding isn’t needed); del_range_byte and
insert_from_buffer (the optimized way used when decoding is needed);
decode_coding_gap or insert_from_gap_1 (I’m not sure of the condition for
this, but anyway it’s safe). The function also calls memcpy and
memmove, but they are irrelevant: memcpy is used for decoding, and
memmove is moving stuff inside the gap for decode_coding_gap.
I’d love someone to verify this function, since it’s so complicated
and large, but from what I can tell it’s safe.
fns.c:3998: signal_after_change (XFIXNAT (beg), 0, inserted_chars);
Called in base64-decode-region, uses insert_1_both and del_range_both,
safe.
insdel.c:681: signal_after_change (opoint, 0, len);
insdel.c:696: signal_after_change (opoint, 0, len);
insdel.c:741: signal_after_change (opoint, 0, len);
insdel.c:757: signal_after_change (opoint, 0, len);
insdel.c:976: signal_after_change (opoint, 0, PT - opoint);
insdel.c:996: signal_after_change (opoint, 0, PT - opoint);
insdel.c:1187: signal_after_change (opoint, 0, PT - opoint);
insdel.c:1412: signal_after_change. */
insdel.c:1585: signal_after_change (from, nchars_del, GPT - from);
insdel.c:1600: prepare_to_modify_buffer and never call signal_after_change.
insdel.c:1603: region once. Apart from signal_after_change, any caller of this
insdel.c:1747: signal_after_change (from, to - from, 0);
insdel.c:1789: signal_after_change (from, to - from, 0);
insdel.c:1833: signal_after_change (from, to - from, 0);
insdel.c:2223:signal_after_change (ptrdiff_t charpos, ptrdiff_t lendel, ptrdiff_t lenins)
insdel.c:2396: signal_after_change (begpos, endpos - begpos - change, endpos - begpos);
I’ve checked all insdel functions. We can assume insdel functions are
all safe.
json.c:790: signal_after_change (PT, 0, inserted);
Called in json-insert, calls either decode_coding_gap or
insert_from_gap_1, both are safe. Calls memmove but it’s for
decode_coding_gap.
keymap.c:2873: /* Insert calls signal_after_change which may GC. */
Not code.
print.c:219: signal_after_change (PT - print_buffer.pos, 0, print_buffer.pos);
Called in print_finish, calls copy_text and insert_1_both, safe.
process.c:6365: process buffer is changed in the signal_after_change above.
search.c:2763: (see signal_before_change and signal_after_change). Try to error
Not code.
search.c:2777: signal_after_change (sub_start, sub_end - sub_start, SCHARS (newtext));
Called in replace_match. Calls replace_range, upcase-region, and
upcase-initials-region (both of which call casify_region in the end), safe.
Calls memcpy but it’s for string manipulation.
textprop.c:1261: signal_after_change (XFIXNUM (start), XFIXNUM (end) - XFIXNUM (start),
textprop.c:1272: signal_after_change (XFIXNUM (start), XFIXNUM (end) - XFIXNUM (start),
textprop.c:1283: signal_after_change (XFIXNUM (start), XFIXNUM (end) - XFIXNUM (start),
textprop.c:1458: signal_after_change (XFIXNUM (start), XFIXNUM (end) - XFIXNUM (start),
textprop.c:1652: signal_after_change (XFIXNUM (start), XFIXNUM (end) - XFIXNUM (start),
textprop.c:1661: signal_after_change (XFIXNUM (start), XFIXNUM (end) - XFIXNUM (start),
textprop.c:1672: signal_after_change (XFIXNUM (start), XFIXNUM (end) - XFIXNUM (start),
textprop.c:1750: before changes are made and signal_after_change when we are done.
textprop.c:1752: and call signal_after_change before returning if MODIFIED. */
textprop.c:1764: signal_after_change (XFIXNUM (start),
textprop.c:1778: signal_after_change (XFIXNUM (start), XFIXNUM (end) - XFIXNUM (start),
textprop.c:1791: signal_after_change (XFIXNUM (start), XFIXNUM (end) - XFIXNUM (start),
textprop.c:1810: signal_after_change (XFIXNUM (start),
We don’t care about text property changes.
Grep finished with 51 matches found at Wed Jun 28 15:12:23