Commit e7e4ecc
Auto merge of #30740 - bluss:ascii-is-the-best, r=brson
Add fast path for ASCII in UTF-8 validation
This speeds up the ASCII case (and long stretches of ASCII in otherwise
mixed UTF-8 data) when checking UTF-8 validity.
Benchmark results suggest that on purely ASCII input, we can improve
throughput (megabytes verified / second) by a factor of 13 to 14 (smallish input).
On XML and mostly English language input (en.wikipedia XML dump),
throughput improves by a factor 7 (large input).
On mostly non-ASCII input, performance increases slightly or is the
same.
The UTF-8 validation is rewritten to use indexed access; since all
access is preceded by a (mandatory for validation) length check, bounds
checks are statically elided by LLVM and this formulation is in fact the best
for performance. A previous version had losses due to slice to iterator
conversions.
A large credit to Björn Steinbrink who improved this patch immensely,
writing this second version.
Benchmark results on x86-64 (Sandy Bridge) compiled with -C opt-level=3.
Old code is `regular`, this PR is called `fast`.
Datasets:
- `ascii` is just ASCII (2.5 kB)
- `cyr` is cyrillic script with ascii spaces (5 kB)
- `dewik10` is 10MB of a de.wikipedia XML dump
- `enwik8` is 100MB of an en.wikipedia XML dump
- `jawik10` is 10MB of a ja.wikipedia XML dump
```
test from_utf8_ascii_fast ... bench: 140 ns/iter (+/- 4) = 18221 MB/s
test from_utf8_ascii_regular ... bench: 1,932 ns/iter (+/- 19) = 1320 MB/s
test from_utf8_cyr_fast ... bench: 10,025 ns/iter (+/- 245) = 511 MB/s
test from_utf8_cyr_regular ... bench: 10,944 ns/iter (+/- 795) = 468 MB/s
test from_utf8_dewik10_fast ... bench: 6,017,909 ns/iter (+/- 105,755) = 1740 MB/s
test from_utf8_dewik10_regular ... bench: 11,669,493 ns/iter (+/- 264,045) = 891 MB/s
test from_utf8_enwik8_fast ... bench: 14,085,692 ns/iter (+/- 1,643,316) = 7000 MB/s
test from_utf8_enwik8_regular ... bench: 93,657,410 ns/iter (+/- 5,353,353) = 1000 MB/s
test from_utf8_jawik10_fast ... bench: 29,154,073 ns/iter (+/- 4,659,534) = 340 MB/s
test from_utf8_jawik10_regular ... bench: 29,112,917 ns/iter (+/- 2,475,123) = 340 MB/s
```
Co-authored-by: Björn Steinbrink <[email protected]>2 files changed
+72
-27
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
479 | 479 | | |
480 | 480 | | |
481 | 481 | | |
| 482 | + | |
| 483 | + | |
| 484 | + | |
| 485 | + | |
| 486 | + | |
| 487 | + | |
| 488 | + | |
| 489 | + | |
| 490 | + | |
| 491 | + | |
| 492 | + | |
| 493 | + | |
482 | 494 | | |
483 | 495 | | |
484 | 496 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
32 | 32 | | |
33 | 33 | | |
34 | 34 | | |
| 35 | + | |
35 | 36 | | |
36 | 37 | | |
37 | 38 | | |
| |||
240 | 241 | | |
241 | 242 | | |
242 | 243 | | |
243 | | - | |
| 244 | + | |
244 | 245 | | |
245 | 246 | | |
246 | 247 | | |
| |||
1074 | 1075 | | |
1075 | 1076 | | |
1076 | 1077 | | |
1077 | | - | |
| 1078 | + | |
1078 | 1079 | | |
1079 | 1080 | | |
| 1081 | + | |
| 1082 | + | |
| 1083 | + | |
| 1084 | + | |
| 1085 | + | |
| 1086 | + | |
| 1087 | + | |
| 1088 | + | |
| 1089 | + | |
1080 | 1090 | | |
1081 | 1091 | | |
1082 | 1092 | | |
1083 | 1093 | | |
1084 | 1094 | | |
1085 | | - | |
1086 | | - | |
1087 | | - | |
1088 | | - | |
1089 | | - | |
1090 | | - | |
1091 | | - | |
1092 | | - | |
| 1095 | + | |
| 1096 | + | |
| 1097 | + | |
| 1098 | + | |
| 1099 | + | |
1093 | 1100 | | |
1094 | | - | |
1095 | 1101 | | |
1096 | | - | |
| 1102 | + | |
1097 | 1103 | | |
1098 | 1104 | | |
1099 | 1105 | | |
1100 | | - | |
1101 | | - | |
1102 | | - | |
1103 | | - | |
1104 | | - | |
| 1106 | + | |
| 1107 | + | |
| 1108 | + | |
| 1109 | + | |
| 1110 | + | |
1105 | 1111 | | |
1106 | | - | |
1107 | | - | |
1108 | | - | |
1109 | | - | |
1110 | | - | |
1111 | | - | |
1112 | | - | |
1113 | | - | |
| 1112 | + | |
| 1113 | + | |
1114 | 1114 | | |
1115 | | - | |
1116 | | - | |
| 1115 | + | |
1117 | 1116 | | |
1118 | 1117 | | |
1119 | 1118 | | |
| |||
1156 | 1155 | | |
1157 | 1156 | | |
1158 | 1157 | | |
| 1158 | + | |
| 1159 | + | |
| 1160 | + | |
| 1161 | + | |
| 1162 | + | |
| 1163 | + | |
| 1164 | + | |
| 1165 | + | |
| 1166 | + | |
| 1167 | + | |
| 1168 | + | |
| 1169 | + | |
| 1170 | + | |
| 1171 | + | |
| 1172 | + | |
| 1173 | + | |
| 1174 | + | |
| 1175 | + | |
| 1176 | + | |
| 1177 | + | |
| 1178 | + | |
| 1179 | + | |
| 1180 | + | |
| 1181 | + | |
| 1182 | + | |
| 1183 | + | |
| 1184 | + | |
| 1185 | + | |
| 1186 | + | |
| 1187 | + | |
| 1188 | + | |
| 1189 | + | |
1159 | 1190 | | |
1160 | 1191 | | |
| 1192 | + | |
| 1193 | + | |
1161 | 1194 | | |
1162 | 1195 | | |
1163 | 1196 | | |
| |||
0 commit comments